STORM: An innovative writing system

Developed by researchers at Stanford University, STORM uses large language models (LLMs) to automate writing long articles from scratch, with breadth and depth comparable to Wikipedia.

STORM collects information from multiple angles by simulating conversations in which a writer questions a topic expert. It builds an outline from what it gathers, then writes the complete article section by section, with citations.

Main challenges and solutions:

Challenge: Wikipedia-style articles require in-depth research and planning, including extensive collection of reference materials and elaborate outlines. Existing efforts to generate Wikipedia articles often bypass this pre-writing stage.

Solution: STORM automates the human writing process by simulating its pre-writing, drafting and revision phases. Its main focus is the pre-writing phase, where it drives the research by asking effective questions.

STORM’s workflow:

1. Discover different perspectives: STORM first retrieves and analyzes Wikipedia articles on topics similar to the given one. From these it identifies diverse perspectives to guide the research, which helps keep the content comprehensive and in-depth.

2. Simulated dialogue: Next, the system simulates a conversation in which a writer asks questions of a topic expert. An LLM generates in-depth questions from each perspective to deepen understanding of the topic, and the expert's answers are grounded in trusted sources on the Internet.

3. Create an outline: Based on the information collected and the questions raised, STORM automatically creates an outline of the article. This outline aims to organize the structure of the article to ensure breadth and depth of content coverage.

In the final writing stage, STORM generates text with citations and writes a complete article section by section.
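The workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not STORM's actual code or API: every function name, the prompt tags (`PERSPECTIVES:`, `QUESTION:` and so on), and the `ask` and `search` callables are hypothetical stand-ins for an LLM call and a retrieval call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-ins, not part of STORM itself.
AskFn = Callable[[str], str]           # LLM call: prompt -> text
SearchFn = Callable[[str], List[str]]  # retrieval: query -> source identifiers


@dataclass
class Turn:
    """One question/answer exchange between the simulated writer and expert."""
    perspective: str
    question: str
    answer: str
    sources: List[str]


def _lines(text: str) -> List[str]:
    """Split an LLM reply into non-empty, stripped lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def discover_perspectives(topic: str, related: List[str], ask: AskFn) -> List[str]:
    # Step 1: derive perspectives from articles on similar topics.
    return _lines(ask(f"PERSPECTIVES: topic={topic}; related={', '.join(related)}"))


def simulate_dialogue(topic: str, perspective: str, ask: AskFn,
                      search: SearchFn, max_turns: int = 3) -> List[Turn]:
    # Step 2: the writer asks, the expert answers from retrieved sources.
    turns: List[Turn] = []
    for _ in range(max_turns):
        history = "\n".join(f"Q: {t.question}\nA: {t.answer}" for t in turns)
        question = ask(f"QUESTION: topic={topic}; perspective={perspective}\n{history}")
        sources = search(question)
        answer = ask(f"ANSWER: {question}\nSources:\n" + "\n".join(sources))
        turns.append(Turn(perspective, question, answer, sources))
    return turns


def create_outline(topic: str, turns: List[Turn], ask: AskFn) -> List[str]:
    # Step 3: organize the collected questions and answers into headings.
    notes = "\n".join(f"{t.question} -> {t.answer}" for t in turns)
    return _lines(ask(f"OUTLINE: topic={topic}\n{notes}"))


def write_article(outline: List[str], turns: List[Turn], ask: AskFn) -> Dict[str, str]:
    # Final stage: write each section, numbering every distinct source once
    # so that the text can cite it as [n].
    sources: List[str] = []
    for t in turns:
        for s in t.sources:
            if s not in sources:
                sources.append(s)
    refs = "\n".join(f"[{i}] {s}" for i, s in enumerate(sources, start=1))
    notes = "\n".join(f"{t.question} -> {t.answer}" for t in turns)
    return {h: ask(f"SECTION: {h}\nNotes:\n{notes}\nReferences:\n{refs}")
            for h in outline}


def run_pipeline(topic: str, related: List[str], ask: AskFn,
                 search: SearchFn, max_turns: int = 3) -> Dict[str, str]:
    turns: List[Turn] = []
    for perspective in discover_perspectives(topic, related, ask):
        turns.extend(simulate_dialogue(topic, perspective, ask, search, max_turns))
    outline = create_outline(topic, turns, ask)
    return write_article(outline, turns, ask)
```

Passing `ask` and `search` in as plain callables keeps the sketch independent of any particular LLM provider or search engine, and makes each stage easy to exercise with stubs.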

The STORM system is designed to solve the following major issues:

1. Automation of pre-writing research: In the traditional process of writing long articles, pre-writing research (including topic research, information collection and outline production) is a time-consuming and complex task. By automating this process, STORM helps authors efficiently collect and organize the information they need, thereby improving writing efficiency.

2. Integration of information from multiple perspectives: For any given topic, exploring and understanding information from different perspectives is the key to producing comprehensive and in-depth articles. STORM automatically collects and organizes topic-related information from multiple perspectives by simulating conversational questions to ensure the comprehensiveness and depth of the article content.

3. Generate a structured article outline: A clear and logical article outline is the basis for high-quality writing. The STORM system innovatively uses retrieved information and questions asked to automatically create outlines, helping authors remain organized and purposeful during the writing process.

4. Improve the quality of articles: Through the above-mentioned automated pre-writing research and outline production process, STORM aims to generate more organized articles with wider content coverage, thereby directly improving the quality of the final article.

Evaluation results:

FreshWiki dataset: To evaluate STORM, the research team curated FreshWiki, a dataset of recent, high-quality Wikipedia articles, and used it to test the quality of the articles the system generates.

Outline quality assessment: In both expert review and automated assessment, the outlines generated by STORM performed well in terms of organization and coverage, showing significant improvements over the baseline models.

Improved writing quality: Compared with articles from an outline-driven, retrieval-augmented baseline, more STORM-generated articles were judged to be well organized (an absolute increase of 25%) and broad in coverage (an increase of 10%).

Expert feedback: Feedback from experienced Wikipedia editors also confirmed the effectiveness of STORM in generating well-documented long articles and pointed to directions for future improvement, such as source bias transfer and over-association of unrelated facts.
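The "absolute increase" reported for organization is easy to misread as a relative gain, so here is a small sketch of the difference. The rates used are made-up illustrative values, not figures from the paper, and the function names are hypothetical.

```python
def absolute_increase(baseline_rate: float, new_rate: float) -> float:
    """Difference between two rates, in percentage points."""
    return new_rate - baseline_rate


def relative_increase(baseline_rate: float, new_rate: float) -> float:
    """Change expressed as a percentage of the baseline rate."""
    return (new_rate - baseline_rate) / baseline_rate * 100


# Hypothetical example: 50% of baseline articles and 75% of the new
# system's articles are judged well organized (NOT the paper's numbers).
baseline, improved = 50.0, 75.0
print(absolute_increase(baseline, improved))  # 25.0 percentage points
print(relative_increase(baseline, improved))  # 50.0 percent of the baseline
```

With these illustrative values, the same change is a 25-point absolute increase but a 50% relative increase, which is why the wording of the reported figure matters.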

Paper: https://arxiv.org/abs/2402.14207
PDF: https://arxiv.org/pdf/2402.14207.pdf
