Upload data using command line
Importing data via command line
The import task can be used to add data via the command line.
CLI project flags
Arg | Shorthand | Type | Default | Description |
---|---|---|---|---|
project | p | string | "" | the name of the project to be trained |
format | f | string | "" | the input data format, Learn more |
test | t | boolean | false | test only; do not save the records |
singer | s | boolean | false | set to import data from a Singer tap. Learn more |
ETL technologies such as Singer are supported. Learn more here
CLI data flags
Arg | Type | Default | Description |
---|---|---|---|
mark-complete | boolean | false | mark all records as complete |
mark-eval | boolean | false | mark all records for accuracy evaluation (test) |
upload-name | string | *auto-generated if not provided* | the name for the upload |
tags | number | '' | comma-separated tags for the upload |
For importing JSONL
The parameter below is additionally required.
Arg | Type | Default | Description |
---|---|---|---|
json-map | string | {"Completed": "completed", "Key": "key", "Data": "data", "EntityLabels": "meta_data", "Prev": "prev", "Next": "next", "IsAcharya": true } | json mappings of the data passed |
To parse the above flags, json-details needs to be specified for JSONL data and record-details needs to be specified for IOBStyle data.
If the data flags are not specified, the records will be set to pending, an upload name will be generated, and the tags will be empty.
Sample code for importing JSONL
The three commands below are for Bash, PowerShell, and Command Prompt, in that order.
<JSONL data> | ./acharya task import -p Prj-3 -f JSONL json-details --json-map='{"Completed": "completed", "Key": "key", "Data": "Data", "EntityLabels": "Entities", "Prev": "prev", "Next": "next", "IsAcharya": false }' --upload-name="<Some name>"
<JSONL data> | ./acharya task import -p Prj-3 -f JSONL json-details --json-map="{\`"Completed\`": \`"completed\`", \`"Key\`": \`"key\`", \`"Data\`": \`"Data\`", \`"EntityLabels\`": \`"Entities\`", \`"Prev\`": \`"prev\`", \`"Next\`": \`"next\`", \`"IsAcharya\`": false }" --upload-name="<Some name>" --mark-complete
<JSONL data> | ./acharya task import -p Prj-3 -f JSONL json-details --json-map="{"\"Completed"\": "\"completed"\", "\"Key"\": "\"key"\", "\"Data"\": "\"Data"\", "\"EntityLabels"\": "\"Entities"\", "\"Prev"\": "\"prev"\", "\"Next"\": "\"next"\", "\"IsAcharya"\" : false }" --upload-name="<Some name>"
Note: IsAcharya in json-map for JSONL should be set to true only if json-map matches the default json-map values.
Use the test flag to test the data before uploading it.
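A dry run with the test flag could look like the sketch below. It assumes the `-t` shorthand from the project flags table is placed with the other project flags, before json-details. The `ACHARYA_BIN` variable and the `dry_run_import` function are hypothetical helpers added for illustration and are not part of the CLI.

```shell
# Path to the acharya executable (assumption: it sits in the current directory).
ACHARYA_BIN="${ACHARYA_BIN:-./acharya}"

# Same mapping as in the Bash import sample.
JSON_MAP='{"Completed": "completed", "Key": "key", "Data": "Data", "EntityLabels": "Entities", "Prev": "prev", "Next": "next", "IsAcharya": false }'

# dry_run_import PROJECT FILE
# Feeds FILE to the import task with -t, so the records are tested but not saved.
dry_run_import() {
  "$ACHARYA_BIN" task import -p "$1" -f JSONL -t json-details --json-map="$JSON_MAP" < "$2"
}
```

For example, `dry_run_import Prj-3 data.jsonl` would test a (hypothetical) local file named data.jsonl against project Prj-3. If the result looks right, the same command without `-t` performs the actual import.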
Sample upload result
Total Records: The number of records uploaded by the user in that upload
Inserted Records: The number of records that were inserted
Invalid Records: The number of records found to be invalid
Errors: The number of errors that occurred during the upload
A data import/upload creates an event that can be viewed in the UI.
Errors
Call errors: The errors that happen while uploading
Import errors: The errors that happen while importing the data
Event errors: The errors that happen while adding an event about the upload
The import task requires login. Please follow the instructions here
Importing using Brat-standoff to JSON converter
Brat-standoff to JSON converter is an external CLI tool that must be downloaded and run to convert brat standoff data into a JSON format that can be uploaded to a Project.
Using brat Standoff Converter
git clone https://github.com/astutic/bratStandoffConverter.git
OR
Download a release from here
Then run the source with Go, or use the executable.
Examples
Generates Acharya format for files in a specific directory and logs it to the console
go run main.go -p "./path/to/the/collection"
OR
bratconverter -p "./path/to/the/collection"
example
go run main.go -p "./testData/news"
OR
bratconverter -p "./testData/news"
Generate an output file
go run main.go -p "./path/to/the/collection" --output "path/output-file-name"
OR
bratconverter -p "./path/to/the/collection" --output "path/output-file-name"
example
The command below will generate an output file named acharyaFormat.jsonl in the current directory
go run main.go -p "./testData/news" --output "./acharyaFormat.jsonl"
OR
bratconverter -p "./testData/news" --output "./acharyaFormat.jsonl"
Generating for specific files
NOTE: the order of the .ann files and the .txt files must be the same.
go run main.go --ann "file1.ann,file2.ann" --text "file1.txt,file2.txt" --conf "file.conf"
example
go run main.go --ann "path/to/first.ann,path/to/second.ann" --text "path/to/first.txt,path/to/second.txt" --conf "path/to/annotation.conf"
OR
bratconverter --ann "path/to/first.ann,path/to/second.ann" --text "path/to/first.txt,path/to/second.txt" --conf "path/to/annotation.conf"
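Keeping the two lists in the same order by hand is error-prone, so they can be generated instead. The sketch below assumes each .ann file has a .txt file with the same base name in the same folder. `build_pairs`, `ANN_LIST`, and `TXT_LIST` are hypothetical names used for illustration, and paths that contain commas are not handled.

```shell
# build_pairs DIR
# Sets ANN_LIST and TXT_LIST to comma-separated paths in matching order.
# An .ann file without a same-named .txt file is skipped with a warning.
build_pairs() {
  dir=$1
  ANN_LIST=""
  TXT_LIST=""
  for ann in "$dir"/*.ann; do
    [ -e "$ann" ] || continue            # the glob matched nothing
    txt="${ann%.ann}.txt"                # same base name, .txt extension
    [ -e "$txt" ] || { echo "skipping $ann: no matching .txt" >&2; continue; }
    ANN_LIST="${ANN_LIST:+$ANN_LIST,}$ann"
    TXT_LIST="${TXT_LIST:+$TXT_LIST,}$txt"
  done
}
```

After `build_pairs "./testData/news"`, the lists could be passed with the shorthand flags from the Commands table, for example `bratconverter -a "$ANN_LIST" -t "$TXT_LIST" -c "path/to/annotation.conf"`.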
Commands
Command | Short hand | Type | Description | Default value |
---|---|---|---|---|
folderPath | p | string | Path to the folder containing the collection | |
ann | a | string | Comma-separated locations of the annotation files (.ann) in correct order | |
txt | t | string | Comma-separated locations of the text files (.txt) in correct order | |
conf | c | string | Location of the annotation configuration file (annotation.conf) | |
output | o | string | Name of the output file to be generated | |
force | f | bool | If you wish to overwrite the generated file then set force to true | false |
version | v | bool | Prints the version of bratconverter | false |
Original data displayed in brat
Data from Brat converted to Acharya format
[ Windows PowerShell ] If you use the Brat → JSONL converter and the brat standoff data contains non-English characters, it is advised to run the following in PowerShell first:
$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = New-Object System.Text.UTF8Encoding
Note
Features that are currently unsupported:
Importing to a Project
Since the brat-standoff to JSONL converter outputs JSONL, its output can be imported like any other JSONL. Example (PowerShell):
./bratconverter -p "./testData/news" | & '.\acharya' task import -p Prj-3 -f JSONL json-details --json-map="{\`"Completed\`": \`"completed\`", \`"Key\`": \`"key\`", \`"Data\`": \`"Data\`", \`"EntityLabels\`": \`"Entities\`", \`"Prev\`": \`"prev\`", \`"Next\`": \`"next\`", \`"IsAcharya\`": false }"
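On Bash, the same pipeline might be written as in the sketch below, reusing the single-quoted json-map from the Bash import sample. The `BRAT_BIN` and `ACHARYA_BIN` variables and the `convert_and_import` function are hypothetical wrappers added for illustration, and both executables are assumed to be in the current directory.

```shell
# Locations of the two executables (assumption: current directory).
BRAT_BIN="${BRAT_BIN:-./bratconverter}"
ACHARYA_BIN="${ACHARYA_BIN:-./acharya}"

# Single quotes keep the inner double quotes intact in Bash.
JSON_MAP='{"Completed": "completed", "Key": "key", "Data": "Data", "EntityLabels": "Entities", "Prev": "prev", "Next": "next", "IsAcharya": false }'

# convert_and_import COLLECTION_DIR PROJECT
# Converts a brat collection to JSONL and pipes it straight into the import task.
convert_and_import() {
  "$BRAT_BIN" -p "$1" | "$ACHARYA_BIN" task import -p "$2" -f JSONL json-details --json-map="$JSON_MAP"
}
```

For example, `convert_and_import "./testData/news" Prj-3` mirrors the PowerShell command above.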