Thanks for your excellent work!
I want to train a G2P model for another language, in my case Vietnamese. Here is a sample of my lexicon:
phạc ph a_T5 c2
num n u_T0 m2
rim r i_T0 m2
giẫn gi a3_T4 n2
toăm t oa2_T0 m2
lịu l iu_T5
cựi c u2i_T5
õng o_T4 ng2
When I train, I get this error:
INFO:phonetisaurus-train:2023-09-06 17:51:21: Checking command configuration...
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: Directory does not exist. Trying to create.
INFO:phonetisaurus-train:2023-09-06 17:51:21: Checking lexicon for reserved characters: '}', '|', '_'...
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: arpa_path: train/model.o8.arpa
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: corpus_path: train/model.corpus
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: dir_prefix: train
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: grow: False
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: lexicon_file: /tmp/tmp53qaxdn7.txt
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: logger: <Logger phonetisaurus-train (DEBUG)>
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: makeJointNgramCommand: <bound method G2PModelTrainer._mitlm of <__main__.G2PModelTrainer object at 0x7fd3aae0ec40>>
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: model_path: train/model.fst
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: model_prefix: model
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: ngram_order: 8
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: seq1_del: False
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: seq1_max: 2
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: seq2_del: True
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: seq2_max: 2
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: verbose: True
DEBUG:phonetisaurus-train:2023-09-06 17:51:21: phonetisaurus-align --input=/tmp/tmp53qaxdn7.txt --ofile=train/model.corpus --seq1_del=false --seq2_del=true --seq1_max=2 --seq2_max=2 --grow=false
INFO:phonetisaurus-train:2023-09-06 17:51:21: Aligning lexicon...
GitRevision: package
Loading input file: /tmp/tmp53qaxdn7.txt
Please provide a valid input file.
ERROR:phonetisaurus-train:2023-09-06 17:51:21: Alignment failed. Exiting.
Traceback (most recent call last):
File "/home/tupk/anaconda3/envs/nlp/bin/phonetisaurus", line 8, in <module>
sys.exit(main())
File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/site-packages/phonetisaurus/__main__.py", line 74, in main
do_train(args, casing, env)
File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/site-packages/phonetisaurus/__main__.py", line 209, in do_train
train(lexicon=lexicon, model_path=args.model, corpus_path=args.corpus, env=env)
File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/site-packages/phonetisaurus/__init__.py", line 121, in train
subprocess.check_call(train_cmd, cwd=temp_dir_str, env=env)
File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['phonetisaurus-train', '--lexicon', '/tmp/tmp53qaxdn7.txt', '--seq2_del', '--verbose']' returned non-zero exit status 1.
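The log above reports a check for the reserved characters '}', '|' and '_', and the sample entries use '_' inside phoneme symbols such as a_T5. Whether this is what makes the alignment step fail is not confirmed here. As a minimal sketch for listing the affected entries, assuming the lexicon is plain text with one entry per line (the function name `find_reserved` and the last sample entry are hypothetical, added only for illustration):

```python
import re

# Reserved characters exactly as listed in the phonetisaurus-train log line.
RESERVED = re.compile(r"[}|_]")


def find_reserved(lines):
    """Return (line_number, entry) pairs for entries containing a reserved character."""
    hits = []
    for number, line in enumerate(lines, start=1):
        entry = line.strip()
        if entry and RESERVED.search(entry):
            hits.append((number, entry))
    return hits


# First three entries come from the lexicon sample in this issue;
# the last one is a made-up entry without reserved characters.
sample = [
    "phạc ph a_T5 c2",
    "num n u_T0 m2",
    "lịu l iu_T5",
    "abc a b c",
]

for number, entry in find_reserved(sample):
    print(f"line {number}: {entry}")
```

To check a real file, the same function can be given an open file handle, for example `find_reserved(open(path, encoding="utf-8"))`, where `path` is the lexicon path.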
Training with an English lexicon works without any problem.
How can I fix this?
Thank you.