Software Engineer / Data Scientist
Hi! I am Yuan, and I am actively looking for job opportunities. I recently graduated from USC with a master's degree in ECE (Machine Learning and Data Science). During my master's, I worked as a research assistant for one year and contributed to many NLP-related projects, focusing on information retrieval, question answering, and data augmentation across many fields. If you are interested in my profile, please contact me with any questions or suggestions.
Combined previous work with the newest trends. Inspired by the scaling laws paper, started comparing the cost of the following three methods at the same accuracy: 1) in-context learning, 2) few-shot fine-tuning, 3) synthetic data generation with large models. Skills developed: Slurm, GitHub (maintaining a project repo), CUDA, conda environments.
Worked on QuALITY (a question-answering dataset) with synthetic data augmentation. Shifted research direction after GPT-4’s emergence. Skills developed: Python, PyCharm, GitHub, conda environments.
Helped complete the experiments by writing Python code and checking method details. Organized the GitHub repo in the final stage.
Analyzed VLN agents by adding noise to the pretraining data. Proposed a unigram + landmark method to lower the cost of data creation and augmentation. Designed a nonsense-data augmentation method for effective VLN pretraining.
My co-second-author paper has been submitted to EMNLP 2023. The link will be public soon.
Piano audio recognition based on MATLAB
Communication system simulation based on the MATLAB platform
Audio recognition project based on sound dynamics signals (team leader)