The Bench Brief
Data, Drama, & a Dash of Kimchi🍜
Hello from the bench!
Hey benchies — this week we’re going from lightning strikes to kimchi jars, from map quizzes to summer’s stubborn heat 🌞🗺️
The media world isn’t slowing down — and neither are we. Here’s what caught our eye this week 👇
Our featured content this week:
🔢🧼 Easily clean up messy databases with fuzzy matching in R
This is a super fun tutorial that shows how to clean messy text data using fuzzy matching in R. Lucia Walinchus demonstrates how tools like stringdist and tidyverse can group entries with slight spelling differences like “St. Lucie, Florida,” “Saint Lucie, FL,” and “St Lucy, Florida,” using the 2025 Investigative Reporters and Editors (IRE) conference schedule as an example. She reminds users to verify matches manually, especially before using cleaned data in high-stakes reporting. Check out the full tutorial here!

Fuzzy matching allows you to pull together text in a database that has been sorted incorrectly. Photo: Adobe Stock
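For benchies who want to try the idea before opening the tutorial, here is a minimal sketch of fuzzy grouping. It is not the tutorial's code: that one uses R with stringdist and tidyverse, while this is a Python stand-in built on the standard library's difflib. The abbreviation table, the 0.85 threshold, and the extra "Miami, Florida" entry are our own assumptions for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table (an assumption); extend it for your own data.
ABBREVIATIONS = {"st": "saint", "fl": "florida"}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and expand known abbreviations."""
    words = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def similarity(a, b):
    """Return a 0-to-1 similarity score between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_similar(entries, threshold=0.85):
    """Greedily put each entry in the first group whose first member is similar enough."""
    groups = []
    for entry in entries:
        for group in groups:
            if similarity(entry, group[0]) >= threshold:
                group.append(entry)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([entry])
    return groups


entries = [
    "St. Lucie, Florida",
    "Saint Lucie, FL",
    "St Lucy, Florida",
    "Miami, Florida",  # added here as a contrast case, not from the tutorial
]
groups = group_similar(entries)
for group in groups:
    print(group)
```

The threshold is a judgment call: set it too low and unrelated places merge, too high and real variants stay apart. That is exactly why the manual check Walinchus recommends still matters.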
🍜🖋️ Finding Flavor, Finding Home: How Alvin Chang blends personal narrative, data and interactive storytelling
In “The Search for My Kimchi,” Alvin Chang mixes personal memoir, data and interactive visuals to explore how recipes carry identity. He begins hunting for the “perfect” kimchi, then widens the lens: variations in family practice, historical shifts, and cultural memory all play a role. The Q&A invites readers to click, taste, and remember, showing Chang’s hand-drawn sketches and archival research alongside recipes to make the story experiential and emotional. (Read here)

Image courtesy of Alvin Chang
Cool Stuff Corner: What are we reading?
☀️📈 See how much longer summer is in your town
Summer’s overstaying its welcome, and the data proves it. A Washington Post analysis shows the U.S. heat season has stretched by days, even weeks, compared with 30 years ago. Think fewer pumpkin spice vibes and more endless iced lattes.
From Washington, D.C. to Miami, summer now shows up earlier and lingers longer, thanks to rising global temperatures. Fall might need to send a calendar invite.
📍🏙️ How well do you know Boston’s neighborhoods?
Think you know your Back Bay from your Brighton, or Hyde Park from East Boston? Boston.com challenges locals with a neighborhood quiz where you click a map to identify where each area lies. Correct picks glow green; wrong ones flash red. Try until you nail ’em all. It’s all about exploring Boston’s patchwork streets and seeing just how much geography you’ve got locked down. (Letting you in on my embarrassing secret: it took me 19 incorrect attempts to get all of Boston to glow green 🙈)
💡Did you know?
USA Today has launched a generative AI chatbot called DeeperDive, replacing its traditional search box. The tool, developed with open-source models by Taboola, will suggest questions readers might want to ask—and deliver concise answers plus related articles from across the USA Today network. (read more)
From the Vault🏛️
📸🚨 How The New York Times uncovered and visualized the dangers faced by child influencers
The New York Times probed the hidden risks faced by child influencers, particularly young girls whose lives are monetized on Instagram and managed by parents. Analyzing more than 5,000 accounts, the investigation found that “racy” or “suggestive” posts attract high engagement from male followers, some of whom also pressure or stalk the children.
Using image-classification tools while preserving minors’ anonymity, the multimedia report visualized how these dynamics escalate as follower counts rise. The project underscores the darker side of fame in the social media age. (Dive into the full investigation to see what The Times uncovered.)

📊⚡ How a dataset on lightning strikes taught me to not take data at face value
A data journalism tutorial by Veer Mudambi demonstrates how cleaning a lightning-strike fatality dataset reveals misleading impressions hidden in raw numbers. The author notes that small inconsistencies, such as “roofing” vs. “working on the roof” or “walking to the car” vs. “walking your dog,” split the same activity across several labels and distorted how dangerous each one appeared.
After consolidating these duplicate entries, the data showed walking accounted for more fatalities than originally thought. It’s a reminder: even well-sourced data needs careful wrangling before drawing conclusions. (See how data can deceive)
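To see why consolidation changes the picture, here is a small Python sketch. The rows and the keyword rules are made up for illustration (they only borrow the labels quoted above) and are not the actual lightning dataset or the author's cleaning steps.

```python
from collections import Counter

# Hypothetical free-text entries; NOT the real fatality data.
raw_activities = [
    "roofing",
    "working on the roof",
    "walking to the car",
    "walking your dog",
    "walking",
    "fishing",
]

# Hypothetical keyword-to-category rules (an assumption); the first match wins.
RULES = [("roof", "roofing"), ("walk", "walking"), ("fish", "fishing")]


def canonical(activity):
    """Map a free-text activity to a single category via keyword matching."""
    text = activity.lower()
    for keyword, category in RULES:
        if keyword in text:
            return category
    return text  # leave unmatched entries as-is for manual review


raw_counts = Counter(raw_activities)
clean_counts = Counter(canonical(activity) for activity in raw_activities)

print("Before:", raw_counts.most_common())
print("After: ", clean_counts.most_common())
```

Before cleaning, every label appears once and nothing stands out. After cleaning, the three walking variants collapse into one category and it rises to the top, which is the same kind of shift the tutorial describes.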

That's all we've got for this week! Thanks for reading, and let us know if there's anything you'd like to see in these newsletters or in our coverage at [email protected].
And follow us on Instagram, Twitter (or X, or whatever) and LinkedIn for live updates on stories each week!