Exploring large language models: Construction, multimodality, evaluation and prospects

Beijing Zhongke Journal Publishing Co. Ltd.

This study is led by Dr. Yan and Dr. Liu (both of Huazhong University of Science and Technology). It highlights the breakthroughs and advances in generative artificial intelligence (AI) in the months following the release of ChatGPT. The rising popularity of generative AI has attracted capital inflows and spurred innovation across many fields.

The study explores the construction and optimization of large language models (LLMs). Pre-training is the first step in developing products like ChatGPT, followed by supervised fine-tuning and reinforcement learning. Given the high cost of training, the paper introduces open-source models such as LLaMA, Alpaca, and Vicuna, along with cost-saving methods such as LoRA and APE. These resources and technologies help reduce costs while maintaining model performance.
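To give a rough sense of why a method like LoRA saves cost, the sketch below compares trainable-parameter counts for a single weight matrix. It relies on the standard description of LoRA (the pretrained matrix is frozen and only two low-rank factors are trained); the function name, layer size, and rank are illustrative assumptions and are not taken from the study.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    full: every entry of the d_out x d_in matrix is updated.
    lora: the matrix is frozen; only the factors B (d_out x rank)
          and A (rank x d_in) are trained.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_trainable_params(4096, 4096, 8)
print(full, lora, full / lora)  # 16777216 65536 256.0
```

Under these assumed sizes, the low-rank update trains 256 times fewer parameters for that one matrix. Actual savings depend on which layers are adapted and on the rank chosen.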

Evaluating LLMs accurately is critically important. Mainstream evaluation methods currently fall into three types: manual evaluation, automatic evaluation, and evaluation using other LLMs. According to the evaluation results from the Chatbot Arena platform, GPT-4 significantly outperforms other models on most metrics, while the generation quality of many open-source models still lags considerably.
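As an illustration of how pairwise human votes can be turned into a ranking, here is a minimal sketch of an Elo-style rating update, the kind of scheme Chatbot Arena is known for. The press release does not describe the platform's scoring, so the starting ratings, the K-factor of 32, and the function names below are assumptions made only for this example.

```python
def expected_score(rating_a, rating_b):
    """Expected probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def elo_update(rating_a, rating_b, score_a, k=32):
    """Update both ratings after one comparison.

    score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie.
    k (assumed value) controls how far one vote moves the ratings.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


# Two hypothetical models start level; a human voter prefers model A.
print(elo_update(1000, 1000, 1.0))  # (1016.0, 984.0)
```

Repeating this update over many votes yields a leaderboard in which a consistently preferred model pulls ahead of the rest.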

The study details a series of challenges currently facing LLMs and traces their origins. These include the scarcity of open-source large models and datasets, insufficient model stability, the difficulty of knowledge acquisition, weak model interpretability, the complexity of application, and issues related to security and privacy. The study also points out three potential research directions, concerning data, technology, and application.

Finally, the study indicates that the best-performing LLMs to date have demonstrated significant capabilities of a rudimentary artificial general intelligence. Learning and applying prompt engineering techniques can substantially enhance societal productivity. However, LLM technology still urgently needs innovation, which calls for a collaborative and mutually beneficial relationship between leading corporations and the open-source community.

See the article:

The development, application, and future of LLM similar to ChatGPT

https://doi.org/10.11834/jig.230536
