\(\mu\text{TC}\) Command Line Interface

For a given text classification task, \(\mu\text{TC}\) tries to find a suitable text model from the set of possible models defined in the configuration space, using the command:

microTC-params -k3  -Smacrorecall -s24 -n24 user-profiling.json -o user-profiling.params

The parameters mean the following:

  • user-profiling.json is the database of exemplars, one JSON dictionary per line with text and klass keywords (a sample file is sketched after the notes below)

  • -k3 three folds

  • -s24 specifies that the parameter space should be sampled at 24 points, keeping the best configuration among them

  • -n24 specifies the number of processes to launch; it is a good idea to set -s to a multiple of -n

  • -o user-profiling.params specifies the file where the configurations found by the parameter-selection process are stored, in best-first order

  • -S or --score specifies the name of the fitness function (e.g., macrof1, microf1, macrorecall, accuracy, r2, pearsonr, spearmanr)

  • -H makes b4msa perform a final hill-climbing search for the parameter selection; in many cases, this produces much better configurations (and never worse ones)

  • all of these parameters have default values, so no arguments are strictly needed

Notes:

  • “text” can be a string or an array of strings; in the latter case, each string is treated as an independent text when building the final vector.

  • There is no typo: we use “klass” instead of “class” for obscure historical reasons.

  • -k accepts an a:b syntax that searches on a sample of size a and tests on a sample of size b, for 0 < a < 1 and 0 < b < 1 (e.g., -k0.7:0.3). It is common to use b = 1 - a; however, this is not a hard constraint: you only need a + b <= 1 and non-overlapping samples.

  • If -S is r2, pearsonr, or spearmanr, then \(\mu\text{TC}\) computes the parameters for a regression task.
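For reference, a minimal exemplar file follows the one-JSON-dictionary-per-line layout described above; the texts and labels shown here are purely illustrative:

{"text": "I am so happy to see you again", "klass": "joy"}
{"text": "this is the worst day ever", "klass": "anger"}
{"text": ["first message of the user", "second message of the user"], "klass": "fear"}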

Training the model

At this point, we are in a position to train a model. Let us assume that the workload is emotions.json and that the parameters are in emotions.params; then the following command will save the model in emotions.model:

microtc-train -o emotions.model -m emotions.params emotions.json

You can create a regressor by adding the -R option to microtc-train.
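For instance, reusing the emotions.params and emotions.json files from above, the regression variant of the training command would look like this:

microtc-train -R -o emotions.model -m emotions.params emotions.json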

Using the model

At this point, we are in a position to test the model (i.e., emotions.model) on a new set. That is, we can ask the classifier to assign a label to a particular text.

microtc-predict -m emotions.model -o emotions-predicted.json test-emotions.json
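Here, test-emotions.json is assumed to follow the same one-JSON-dictionary-per-line layout, with each line containing at least the text keyword; the content below is illustrative:

{"text": "I cannot stop smiling today"}
{"text": "everything went wrong this morning"}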

Finally, you can evaluate the performance of the prediction against a gold-standard file (gold.json) as follows:

microtc-perf gold.json emotions-predicted.json

This will show a number of scores on the screen:

{
"accuracy": 0.7025,
"f1_anger": 0.705,
"f1_fear": 0.6338797814207651,
"f1_joy": 0.7920353982300885,
"f1_sadness": 0.6596858638743456,
"macrof1": 0.6976502608812997,
"macrof1accuracy": 0.490099308269113,
"macrorecall": 0.7024999999999999,
"microf1": 0.7025,
"quadratic_weighted_kappa": 0.5773930753564155
}

or, when the --regression flag is provided:

{
"filename": "some-path/some-name.predicted",
"pearsonr": [
    0.6311471948385253,
    1.2734619266038659e-23
],
"r2": 0.3276512897198096,
"spearmanr": [
    0.6377984613587965,
    3.112636137077516e-24
]
}