Upload data using command line

Importing data via command line

The import task can be used to add data via the command line.

CLI project flags

| Arg | Shorthand | Type | Default | Description |
| --- | --- | --- | --- | --- |
| project | p | string | "" | the name of the project to be trained |
| format | f | string | "" | the input data format. Learn more |
| test | t | boolean | false | test only; do not save the records |
| singer | s | boolean | false | set to import data from a Singer tap. Learn more |
info

ETL technologies like Singer are supported. Learn more here

CLI data flags

| Arg | Type | Default | Description |
| --- | --- | --- | --- |
| mark-complete | boolean | false | mark all records as complete |
| mark-eval | boolean | false | mark all records for accuracy evaluation (test) |
| upload-name | string | auto-generated if not provided | the name for the upload |
| tags | number | '' | comma-separated tags for the upload |

Importing JSONL additionally requires the parameter below

| Arg | Type | Default | Description |
| --- | --- | --- | --- |
| json-map | string | {"Completed": "completed", "Key": "key", "Data": "data", "EntityLabels": "meta_data", "Prev": "prev", "Next": "next", "IsAcharya": true } | JSON mappings of the data passed |
info

To parse the above flags, json-details must be specified for JSONL data and record-details must be specified for IOBStyle data.

If the flags are not specified, the records are set to pending, an upload name is generated, and the tags are left empty.

Sample code for importing JSONL

<JSONL data> | ./acharya task import -p Prj-3 -f JSONL json-details --json-map='{"Completed": "completed", "Key": "key", "Data": "Data", "EntityLabels": "Entities", "Prev": "prev", "Next": "next", "IsAcharya": false }' --upload-name="<Some name>"
note

IsAcharya in json-map for JSONL should be set to true only if the json-map matches the default json-map values.

tip

Use the test flag to test the data before uploading it
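One way to apply this tip is to run the import twice: first with the test flag, then for real only if the test run succeeds. The sketch below makes two assumptions that this page does not confirm: that --test is placed with the other project flags before json-details, and that the CLI exits with a non-zero status when the test run fails. The names import_with_check and ACHARYA_BIN are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: dry-run a JSONL file, then upload it if the dry run passes.
# ACHARYA_BIN is an assumed override for the location of the acharya executable.
import_with_check() {
  file="$1"
  project="$2"
  json_map="$3"
  acharya="${ACHARYA_BIN:-./acharya}"

  # Dry run: --test checks the records without saving them.
  # Assumption: a failed test run returns a non-zero exit status.
  "$acharya" task import -p "$project" -f JSONL --test \
    json-details --json-map="$json_map" < "$file" || return 1

  # Real upload, reached only when the dry run succeeded.
  "$acharya" task import -p "$project" -f JSONL \
    json-details --json-map="$json_map" < "$file"
}
```

It could be called as import_with_check data.jsonl Prj-3 "$JSON_MAP", where JSON_MAP holds the same mapping string used in the JSONL sample.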

Sample upload result

[Image: sample upload result CLI]

Total Records: The number of records uploaded by the user in that upload

Inserted Records: The number of records that were inserted

Invalid Records: The number of records found to be invalid

Errors: The number of errors that occurred during the upload

info

A data import/upload creates an event that can be viewed in the UI.

Errors

Call errors: The errors that happen while uploading

Import errors: The errors that happen while importing the data

Event errors: The errors that happen while adding an event about the upload

note

The import task requires login. Please follow the instructions here

Importing using Brat-standoff to JSON converter

The Brat-standoff to JSON converter is an external CLI tool that must be downloaded and run to convert brat standoff data into a JSON format that can be uploaded to a Project.

Using brat Standoff Converter

git clone https://github.com/astutic/bratStandoffConverter.git

OR

Download a release from here

Then run main.go using Go, or use the executable

Examples

Generates Acharya format for files in a specific directory and logs it to the console

go run main.go -p "./path/to/the/collection"

OR

bratconverter -p "./path/to/the/collection"

example

go run main.go -p "./testData/news"

OR

bratconverter  -p "./testData/news"

Generate an output file

go run main.go -p "./path/to/the/collection" --output "path/output-file-name"

OR

bratconverter  -p "./path/to/the/collection" --output "path/output-file-name"

example

The command below will generate an output file named acharyaFormat.jsonl in the current directory

go run main.go -p "./testData/news" --output "./acharyaFormat.jsonl"

OR

bratconverter  -p "./testData/news" --output "./acharyaFormat.jsonl"

Generating for specific files

NOTE: the .ann files and the .txt files must be listed in the same order
go run main.go --ann "file1.ann,file2.ann" --text "file1.txt,file2.txt" --conf "file.conf"

example

go run main.go --ann "path/to/first.ann,path/to/second.ann" --text "path/to/first.txt,path/to/second.txt" --conf "path/to/annotation.conf"

OR

bratconverter  --ann "path/to/first.ann,path/to/second.ann" --text "path/to/first.txt,path/to/second.txt" --conf "path/to/annotation.conf"
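Because the two lists must line up entry by entry, it can be safer to generate them than to type them by hand. The sketch below assumes the usual brat layout in which each document has a NAME.ann and a NAME.txt with the same base name in one directory. The function name paired_lists and the variables ann_list and txt_list are hypothetical, and paths containing commas are not handled.

```shell
# Hypothetical helper: build matching comma-separated lists for --ann and --text.
# Results are left in the shell variables ann_list and txt_list.
paired_lists() {
  dir="$1"
  ann_list=""
  txt_list=""
  for ann in "$dir"/*.ann; do
    [ -e "$ann" ] || continue          # no .ann files matched the glob
    txt="${ann%.ann}.txt"              # same base name, .txt extension
    [ -e "$txt" ] || continue          # skip annotations without a text file
    # Append to both lists in the same iteration so the order always matches.
    ann_list="${ann_list:+$ann_list,}$ann"
    txt_list="${txt_list:+$txt_list,}$txt"
  done
}
```

After calling paired_lists "path/to", the values can be passed on as --ann "$ann_list" --text "$txt_list" together with the usual --conf argument.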

Commands

| Command | Short hand | Type | Description | Default value |
| --- | --- | --- | --- | --- |
| folderPath | p | string | Path to the folder containing the collection | |
| ann | a | string | Comma-separated locations of the annotation files (.ann) in the correct order | |
| txt | t | string | Comma-separated locations of the text files (.txt) in the correct order | |
| conf | c | string | Location of the annotation configuration file (annotation.conf) | |
| output | o | string | Name of the output file to be generated | |
| force | f | bool | Set force to true to overwrite the generated file | false |
| version | v | bool | Prints the version of bratconverter | false |

Original data displayed in brat

[Image: original data displayed in brat]

Data from Brat converted to Acharya format

[Image: Brat data displayed]
note

[Windows PowerShell] If the Brat standoff data contains non-English characters, it is advised to set the following in PowerShell before running the Brat → JSONL converter:

$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = New-Object System.Text.UTF8Encoding

Note

Features that are currently unsupported:

Importing to a Project

Since the brat-standoff to JSONL converter outputs JSONL, its output can be imported like any other JSONL. Example (PowerShell):

./bratconverter  -p "./testData/news" | & '.\acharya' task import -p Prj-3 -f JSONL json-details --json-map="{\`"Completed\`": \`"completed\`", \`"Key\`": \`"key\`", \`"Data\`": \`"Data\`", \`"EntityLabels\`": \`"Entities\`", \`"Prev\`": \`"prev\`", \`"Next\`": \`"next\`", \`"IsAcharya\`": false }"
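For a POSIX shell such as bash, the same pipeline can probably be written without the backtick escaping, since single quotes keep the inner double quotes intact, as in the JSONL sample earlier on this page. This is a sketch, not a verified command: the function name import_brat_collection and the BRATCONVERTER_BIN and ACHARYA_BIN overrides are hypothetical, and both executables are assumed to be in the current directory by default.

```shell
# JSON map copied from the JSONL sample; single quotes preserve the double quotes.
JSON_MAP='{"Completed": "completed", "Key": "key", "Data": "Data", "EntityLabels": "Entities", "Prev": "prev", "Next": "next", "IsAcharya": false }'

# Hypothetical helper: convert a brat collection and pipe the JSONL into a project.
# BRATCONVERTER_BIN and ACHARYA_BIN are assumed overrides for the binary paths.
import_brat_collection() {
  collection="$1"
  project="$2"
  "${BRATCONVERTER_BIN:-./bratconverter}" -p "$collection" |
    "${ACHARYA_BIN:-./acharya}" task import -p "$project" -f JSONL \
      json-details --json-map="$JSON_MAP"
}
```

A call such as import_brat_collection "./testData/news" Prj-3 would then mirror the PowerShell example above.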