Scripts to facilitate import/export between ELAN and FLEx
The purpose of this part of flibl is to let FLEx display conversational texts (i.e. non-monologic) in a way that can be understood to involve multiple people. Namely, we put a note at the bottom of each utterance to say who is speaking/signing.
You can run elan_tiers.py to create a file where you can view the IDs, Types, and Participants for each of the tiers in your EAF file; this will be helpful if you want to specify if certain tiers are to be excluded or considered translation tiers.
elan_tiers.py
elan_tiers.py is a Python script you can use to identify
which tiers you have in your ELAN file. Within the script, enter the name
of the EAF you would like to examine where indicated (below the line
saying
# Give the file name (using the relative path to this Python file or absolute path)).
Make sure to include the path to that file, so if it’s within another
folder, specify it by filling out the line as:
eaf_file = "./inner_folder/deeper_folder/file_name.eaf"
The period . at the beginning means that you’re
referring to the current folder, where the Python program is located. If
your EAF is located outside of the folder with
elan_tiers.py, you can use the absolute path of the file
here, something like
eaf_file = "/home/username/Documents/field_files/session01/file_name.eaf"
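If it helps to see the difference between the two kinds of path, here is a minimal sketch (not taken from elan_tiers.py) of how a relative path can be combined with the script's folder. The function name resolve_eaf_path, the script_dir argument, and the folder /home/username/flibl are hypothetical and only there for illustration. Keep in mind that Python itself resolves relative paths against the folder you run the command from, which is the script's folder when you run the script from there.

```python
from pathlib import Path


def resolve_eaf_path(eaf_file, script_dir):
    """Return the full location of eaf_file (hypothetical helper, for illustration)."""
    path = Path(eaf_file)
    if path.is_absolute():
        # An absolute path already says exactly where the file is.
        return path
    # A relative path such as "./inner_folder/file_name.eaf" is joined
    # onto the folder that holds the script.
    return Path(script_dir) / path


# Hypothetical script folder, used only to show the result of the join.
example = resolve_eaf_path(
    "./inner_folder/deeper_folder/file_name.eaf", "/home/username/flibl"
)
print(example)
```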
The output of the script will be in a text file that has the name of your EAF file suffixed with “_tiers.txt” (in this case, it will be file_name_tiers.txt). You’ll find it in the folder where the original EAF is located. The information displayed will be each tier ID accompanied by the ELAN “type” you gave that tier and the participant speaking/signing in that tier:
TIER_ID:
LINGUISTIC_TYPE_REF:
PARTICIPANT:
That information is important for using flibl, because
you’ll need to specify information about which tiers to include when you
run it.
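If you are curious what a script like this has to do, the following is a rough sketch using only Python's standard library. It is not the actual code of elan_tiers.py: the function names list_tiers and write_tier_report are made up for this example, and the exact layout of the real output file may differ. It relies on the fact that an EAF file is XML in which each tier is a TIER element carrying the three attributes listed above.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def list_tiers(eaf_file):
    """Return (tier ID, type, participant) for every TIER element in an EAF file."""
    root = ET.parse(eaf_file).getroot()
    tiers = []
    for tier in root.iter("TIER"):
        tiers.append((
            tier.get("TIER_ID", ""),
            tier.get("LINGUISTIC_TYPE_REF", ""),
            # Tiers without a participant get an empty string.
            tier.get("PARTICIPANT", ""),
        ))
    return tiers


def write_tier_report(eaf_file):
    """Write the summary next to the EAF as <name>_tiers.txt and return that path."""
    eaf_path = Path(eaf_file)
    out_path = eaf_path.with_name(eaf_path.stem + "_tiers.txt")
    lines = []
    for tier_id, tier_type, participant in list_tiers(eaf_path):
        lines.append(f"TIER_ID: {tier_id}")
        lines.append(f"LINGUISTIC_TYPE_REF: {tier_type}")
        lines.append(f"PARTICIPANT: {participant}")
        lines.append("")  # blank line between tiers
    out_path.write_text("\n".join(lines), encoding="utf-8")
    return out_path
```

Calling write_tier_report("./inner_folder/deeper_folder/file_name.eaf") would, under these assumptions, create file_name_tiers.txt in the same folder as the EAF.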
flibl_config.json
flibl_config.json is the file where you will put the
information for flibl to use while processing your EAF.
Open this file in a text editor. If you have a Mac, you can use
TextEdit; on Windows, you can use Notepad; and on Linux you can use
gedit. Those are just the default text editors, though; feel free to use
whatever you prefer. This kind of file, JSON, is used
to define things in pairs of corresponding values as well as lists. When
something is enclosed in {braces}, it is part of an object,
i.e. a collection of pairs, each made up of a
key and its corresponding value. For example:
{
    "my_favorite_food": "cheese"
}
When something is enclosed in [brackets], it is part of an array, i.e. a list of things. For example:
[
    "cheese",
    "apples"
]
They can also be nested:
{
    "my_favorite_foods": [
        "cheese",
        "apples"
    ]
}
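To get a feel for how a program reads this kind of file, here is a small sketch using Python's built-in json module (the variable names are just for this example). An object becomes a Python dictionary and an array becomes a Python list. The last part shows a common slip when editing JSON by hand: a trailing comma after the final item is not allowed, and Python's parser rejects it.

```python
import json

text = """
{
    "my_favorite_foods": [
        "cheese",
        "apples"
    ]
}
"""

data = json.loads(text)

# The object is now a dict, and the array inside it is a list.
print(type(data).__name__)           # dict
print(data["my_favorite_foods"])     # ['cheese', 'apples']
print(data["my_favorite_foods"][0])  # cheese

# A trailing comma after the last pair makes the JSON invalid.
trailing_comma_rejected = False
try:
    json.loads('{"my_favorite_food": "cheese",}')
except json.JSONDecodeError as err:
    trailing_comma_rejected = True
    print("Invalid JSON:", err)
```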
The way flibl_config.json is structured is as follows:
file_names: In order to configure flibl for use with
your file, you’ll need to add the name of your file. Write the file name
using the path convention described above for elan_tiers.py to make sure
the program looks in the right location.
language_fonts: In this field you will fill in the
array with objects. All objects must have keys and values for “lang”
(the short code used in FLEx) and “font” (again, as defined in your FLEx
database). For the languages you are using as “Vernacular” languages in
FLEx, you must add another key-value pair of “vernacular”:“true”.
languages: These are the languages you are using and
the FLEx codes that correspond to them.
main_language: The language under study.
child_language: If you’re using a different language
code for the language used by children when they make errors, put that
code here. Otherwise, you can just put in the same code you put into
main_language.
flex_language: This is the language that you have
FLEx’s interface in. If you’re working with an English interface, write
en; if you’re using Spanish FLEx, write es,
etc.
valid_characters_regexes: Under
main_language, put the RegEx pattern that matches all the
characters that are valid for the main language; likewise with
child_language. A RegEx,
or Regular Expression, is a series of characters used to represent
possible sets of characters. In the example config file, you’ll see that
some characters are written with their Unicode hex values–this is
because FLEx is very particular about its encoding, so make
sure you know if you have special characters. You can copy and paste
your special characters into the top of this site
and hit “Convert” to see the hex codes (the one under U+hex). Paste
these into the config file in the format code). The example config file
also has all of the apostrophe-like characters, so if the language(s)
you work with has an apostrophe-like character that gets written with
multiple characters, feel free to copy and paste those into yours. The
reason for this being a RegEx is really just so that you can write a-z
and A-Z as shorthand, rather than having to write
out the entire alphabet. If you want to be more selective, do what you
need to!
exclude_tier_id: Write the tier IDs of the tiers to be
completely excluded from this export process. If you don’t, and the
tiers don’t follow the general expected format, the program will throw
errors. This is why it’s helpful to run elan_tiers.py, so
you know what the tier IDs are.
exclude_tier_type: Similar to the previous field, here
you should put in any tier types you would like to exclude. If
you have, for example, a special tier for notes that you don’t want
included, and you’ve given that a unique tier type, you can put the name
of the type here so you don’t have to write all the IDs of that type.
Again, elan_tiers.py is helpful here.
translation_tiers: Here, put key-value pairs of tier
IDs and language codes for translation tiers.
target_utterance_tier_type: If you have target
utterances on a particular tier type, put the name of that tier
type (NOT tier ID) here. If you don’t have target utterances on
a particular unique tier type, you should go into ELAN and change it so
it fits this pattern.
Assuming you know how to run a Python program: go into the FLExible
directory where you downloaded this folder, then run the program!
python flextext_construction_base-v03-0.py (or replace
python with python3 if that’s your command for
running Python 3.x)
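Before you run the program, it can save time to check that flibl_config.json loads and has the fields described earlier. The sketch below is not part of flibl; the function names are made up for this example. It assumes that all eight field names are top-level keys and that valid_characters_regexes maps names such as main_language to pattern strings. The description above does not spell out the exact nesting, so compare against the example config file and adjust as needed.

```python
import json
import re

# Field names from the description of flibl_config.json.
# Assumption: every one of them is a top-level key in the file.
EXPECTED_KEYS = [
    "file_names",
    "language_fonts",
    "languages",
    "valid_characters_regexes",
    "exclude_tier_id",
    "exclude_tier_type",
    "translation_tiers",
    "target_utterance_tier_type",
]


def missing_keys(config):
    """Return the expected field names that are absent from the config."""
    return [key for key in EXPECTED_KEYS if key not in config]


def invalid_regexes(patterns):
    """Return the names whose pattern does not compile as a regular expression."""
    bad = []
    for name, pattern in patterns.items():
        try:
            re.compile(pattern)
        except (re.error, TypeError):
            bad.append(name)
    return bad


def check_config(path="flibl_config.json"):
    """Load the config and return a list of human-readable problems (empty if none)."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    problems = [f"missing key: {key}" for key in missing_keys(config)]
    regexes = config.get("valid_characters_regexes", {})
    # Assumption: the patterns are stored as a name -> pattern object.
    if isinstance(regexes, dict):
        problems += [f"invalid RegEx for: {name}" for name in invalid_regexes(regexes)]
    return problems
```

Running print(check_config()) from the folder that holds flibl_config.json would list anything the sketch considers missing or malformed. An empty list only means these basic checks passed, not that the values themselves are right for your data.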
Then, in FLEx, import the text. Click “FLExText Interlinear…”