Where Articles Come From

Updat3 ingests articles from multiple news APIs every 30 minutes, groups related articles into single stories, and removes duplicates automatically.

Article ingestion

Articles are pulled every 30 minutes from a combination of news APIs covering hundreds of international outlets. Articles are processed only if they are in English or can be translated automatically. To be stored, an article must have a title, a URL, and at least a brief excerpt.

WorldNews API: Broad international coverage, full article text where available

NewsData API: Category-tagged articles across major regions

Targeted searches: Supplemental searches for thin stories with only one source
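
The storage requirements above (title, URL, excerpt, and English or translatable text) could be checked with something like the following. This is a minimal sketch, not the production code: the `RawArticle` class, its field names, and the `translatable` flag are all hypothetical, and the real API payloads may be shaped differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawArticle:
    # Hypothetical field names; real API payloads may differ.
    title: Optional[str]
    url: Optional[str]
    excerpt: Optional[str]
    language: str = "en"
    translatable: bool = False  # assumed to be set by an earlier translation check


def is_storable(article: RawArticle) -> bool:
    """Return True if the article has a non-empty title, URL, and excerpt,
    and is either in English or can be translated automatically."""
    has_required = all(
        field is not None and field.strip()
        for field in (article.title, article.url, article.excerpt)
    )
    language_ok = article.language == "en" or article.translatable
    return bool(has_required and language_ok)
```

Treating whitespace-only fields as missing is a design choice of this sketch; it keeps articles with an empty excerpt out of storage.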

Story grouping & deduplication

Multiple articles about the same event are merged into a single story. This is why you see "5 sources" under one headline instead of five separate cards.

Matching uses a combination of:

Semantic similarity

Article text is converted to a vector embedding and compared against the embeddings of existing stories. Articles that score above the similarity threshold become candidates for merging.

Title overlap

Shared significant words between article and story titles provide a secondary signal.

Entity matching

Named people, places, and organisations mentioned in both articles boost the match score.

Fact overlap

Key factual statements from both articles are compared for shared content.
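
One way the four signals above might be combined is sketched below. It is an illustration under stated assumptions rather than the actual matcher: the weights, the 0.75 threshold, the stop-word list, and the dictionary keys are all hypothetical, and fact overlap is simplified to exact string matching.

```python
import math
import re

# Hypothetical values: the real threshold, weights, and stop words are not published.
SIMILARITY_THRESHOLD = 0.75
WEIGHTS = {"semantic": 0.6, "title": 0.15, "entity": 0.15, "fact": 0.1}
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "for", "with", "as", "at"}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard(a, b):
    """Share of items common to both collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def significant_words(title):
    """Lower-cased title words with common stop words removed."""
    words = re.findall(r"[a-z0-9']+", title.lower())
    return {w for w in words if w not in STOP_WORDS}


def is_merge_candidate(article, story):
    """Semantic similarity acts as the gate for considering a merge."""
    return cosine_similarity(article["embedding"], story["embedding"]) >= SIMILARITY_THRESHOLD


def match_score(article, story):
    """Weighted sum of the four signals, each in the range 0 to 1."""
    signals = {
        "semantic": cosine_similarity(article["embedding"], story["embedding"]),
        "title": jaccard(significant_words(article["title"]), significant_words(story["title"])),
        "entity": jaccard(article["entities"], story["entities"]),
        "fact": jaccard(article["facts"], story["facts"]),  # simplified: exact matches only
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())
```

In this sketch the embedding comparison decides whether a story is considered at all, and the title, entity, and fact signals then adjust the ranking among candidates.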

Matching errors happen. An article about one Iran story may occasionally be grouped with a different Iran story. We run merge and re-link passes continuously to correct this.

What gets filtered out

We filter articles at ingestion and again before AI enrichment. Filtered types include:

× Wire service roundups and news digests ("AP Morning Brief", "5 stories to know")

× Opinion, op-ed, and commentary pieces, which are not news reporting

× Press releases, job listings, event announcements, and sponsored content

× Corporate earnings boilerplate ("Q3 results", "raises price target")

× Articles with only a headline and no usable body text

× Non-English content that cannot be auto-translated with acceptable quality
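
A simple title-based version of some of these filters might look like the sketch below. The patterns and reason labels are hypothetical and cover only a few of the filtered types; a real filter would likely need broader rules or a classifier.

```python
import re

# Hypothetical patterns, loosely based on the examples quoted above.
FILTER_PATTERNS = {
    "roundup": re.compile(r"\b(morning brief|\d+ (stories|things) to know)\b", re.I),
    "opinion": re.compile(r"^(opinion|op-ed|commentary)\b", re.I),
    "promotional": re.compile(r"\b(press release|sponsored|now hiring)\b", re.I),
    "earnings": re.compile(r"\b(q[1-4] results|raises price target)\b", re.I),
}


def filter_reason(title, body):
    """Return a label explaining why the article is filtered, or None to keep it."""
    if not body or not body.strip():
        return "no_body"  # headline only, no usable body text
    for reason, pattern in FILTER_PATTERNS.items():
        if pattern.search(title):
            return reason
    return None
```

Returning a reason label rather than a plain boolean makes it easier to audit which rule removed an article.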

AI enrichment

Once an article passes quality filters, it is processed by an AI model that writes:

Story summary (BLUF): 2–4 sentence plain-English summary of what happened

Key facts: 4–10 verified factual statements extracted from the source text

Historical context: 2–6 paragraphs of background, starting from the immediate trigger and going to deeper history

Story article: 4–10 paragraph neutral article synthesising all available sources

Why it matters: Real-world consequences for ordinary people

What to watch next: Concrete follow-up indicators
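
The fields listed above could be represented and length-checked roughly as follows. This is a sketch under assumptions: the `Enrichment` class and its attribute names are hypothetical, only the numeric ranges come from the list above, and the sentence counter is deliberately crude.

```python
import re
from dataclasses import dataclass
from typing import List

# (min, max) item counts taken from the field list above.
LIMITS = {
    "key_facts": (4, 10),
    "historical_context": (2, 6),
    "story_article": (4, 10),
}
SUMMARY_SENTENCES = (2, 4)


@dataclass
class Enrichment:
    # Hypothetical attribute names for the AI-written fields.
    summary: str                   # BLUF
    key_facts: List[str]
    historical_context: List[str]  # one entry per paragraph
    story_article: List[str]       # one entry per paragraph
    why_it_matters: str
    what_to_watch_next: List[str]


def validate(enrichment: Enrichment) -> List[str]:
    """Return a list of length problems; an empty list means all ranges are met."""
    problems = []
    for name, (low, high) in LIMITS.items():
        count = len(getattr(enrichment, name))
        if not low <= count <= high:
            problems.append(f"{name}: expected {low}-{high} items, got {count}")
    # Crude sentence split on ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", enrichment.summary.strip()) if s]
    low, high = SUMMARY_SENTENCES
    if not low <= len(sentences) <= high:
        problems.append(f"summary: expected {low}-{high} sentences, got {len(sentences)}")
    return problems
```

A check like this only confirms that the output has the expected shape; it says nothing about whether the content is accurate or neutral.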

The AI is explicitly instructed not to adopt any government's default perspective, to apply identical framing standards to all parties, and to include historical context that implicates powerful actors, not just the most recent provocation.
