ROI Case File No.245.5 | The True Culprit Behind the Vanishing OGP Images

📅 2025-10-10 03:00

🕒 Reading time: 8 min

🏷️ 5WHYS




Prologue: The Mystery of the Vanishing Images

That day, a request arrived at the ROI Detective Agency.

"Detective, when I share my articles on X, the images don't appear. My OGP settings should be perfect... I have 240 articles, but none of them show images properly."

The client was an entrepreneur running a business analysis blog. Despite pouring passion into writing articles, they weren't displaying properly on social media. Without proper presentation, even quality content fails to catch people's attention.

"I see. Let me examine the scene first."

I opened my browser and launched the developer tools. On the surface, it appeared to be a simple OGP configuration issue. However, my detective's intuition whispered: There's another truth hidden in this case.


Chapter 1: Examining the Suspects

Suspect List:
1. Incorrect OGP configuration
2. Image file problems
3. Cache issues
4. URL parameter errors

"Let's start with the basics."

I scrutinized the HTML source code.

<meta property="og:image" content="https://roi-blog.playground.style/static/img/articles/roi_case_file_244/icatch.jpeg?v=2025-09-30 11:00" />

"Wait... this URL parameter contains a space."

Spaces are invalid characters in URLs. This could be confusing X's crawler.
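
For the case notes, here is a minimal sketch of how such a version parameter could be percent-encoded instead of written raw, using Python's standard urllib.parse. The variable names are mine, chosen for illustration; the blog's actual template code is not shown in this file.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the og:image tag above; the version string is the
# space-containing value that appeared in the page source.
base_url = "https://roi-blog.playground.style/static/img/articles/roi_case_file_244/icatch.jpeg"
raw_version = "2025-09-30 11:00"

# quote_via=quote encodes the space as %20 (the default, quote_plus, would emit "+").
query = urlencode({"v": raw_version}, quote_via=quote)
og_image_url = f"{base_url}?{query}"

print(og_image_url)
```

Either encoding the value or avoiding spaces in the first place would keep the URL valid.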

"Is this the culprit?"


Chapter 2: Arrest and Release of the First Suspect

I immediately implemented a fix.

from datetime import datetime

# Compact timestamp with no spaces or punctuation
og_version = datetime.now().strftime('%Y%m%d%H%M%S')
# 2025-09-30 11:00 → 20251009143025

The fix removed the space and generated the timestamp dynamically. After deployment, images appeared on new articles.

"Good, case solved."

...Or so I thought.

Days later, the client contacted me again.

"Detective, images appear on new articles, but old articles still don't show them. And it takes 12 hours for them to appear!"

"Strange... the configuration should be the same."

There's still an unseen truth in this case.


Chapter 3: Hidden Clues

"Why do new articles display images while old articles don't?"

I formulated alternative hypotheses:
- X's cache issue?
- Would shortened URLs solve it?
- Is a 12-hour waiting period necessary?

I tried various countermeasures: shortened URLs, Card Validator, URL parameter modifications...

However, none provided a definitive solution. Some articles began displaying images, but problems persisted with others.

"It's like playing whack-a-mole."

Something didn't add up. I was missing a fundamental problem.

The cause must lie not in surface symptoms, but in deeper structural issues.


Chapter 4: Lighthouse Reveals the Truth

"Wait... have I ever measured the page loading speed?"

I launched Chrome DevTools' Lighthouse.

Seconds later, the score appeared. And there, an unbelievable number was displayed.

Largest Contentful Paint (LCP): 5.72 seconds

"5.72 seconds?"

LCP measures the time until the largest content element on a page (in this case, the article's featured image) is displayed. The ideal is under 2.5 seconds. Above 4 seconds is rated as "Poor."

5.72 seconds was completely unacceptable.
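
The thresholds above can be written down as a tiny classifier. This is only an illustrative sketch; the function name is hypothetical and it is not part of Lighthouse itself.

```python
def rate_lcp(seconds: float) -> str:
    """Classify an LCP value using the thresholds quoted above."""
    if seconds <= 2.5:
        return "Good"
    if seconds <= 4.0:
        return "Needs Improvement"
    return "Poor"


print(rate_lcp(5.72))  # Poor
print(rate_lcp(1.2))   # Good
```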

"This is... not an OGP image problem. This is a site-wide issue."


Chapter 5: Network Tab Exposes the Culprit's Shadow

For a more detailed investigation, I opened the Network tab.

First HTML Request:
- Waiting (TTFB): 5.76 seconds
- Content Download: 0.005 seconds

TTFB (Time To First Byte) — the time until the server returns HTML.

"5.76 seconds...?"

Ideally, this value should be under 0.2 seconds. Users begin to perceive slowness above 1 second.

5.76 seconds was abnormal.
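
The same reading can be approximated outside the browser. Below is a rough sketch using only Python's standard library; measure_ttfb is a hypothetical helper name, and because it also counts connection setup, its figure will not match the DevTools "Waiting" value exactly.

```python
import time
import urllib.request


def measure_ttfb(url: str, timeout: float = 10.0) -> float:
    """Return seconds from sending the request until the first body byte arrives."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)  # block until the first byte of the body is available
        return time.perf_counter() - start


# Example call (performs a real network request, so it is left commented out):
# print(f"{measure_ttfb('https://example.com/'):.2f} s")
```

Running it a few times and comparing the results guards against a single unlucky measurement.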

"The image file is only 21KB, incredibly lightweight. So why does it take this long?"

I was certain.

The real reason OGP images weren't displaying was that the server response was too slow, causing X's crawler to time out.


Chapter 6: Narrowing Down the Suspects

What was happening on the server side? I began experiments, commenting out each process one by one to identify the bottleneck.

Both active: 5.76 seconds
Weekly Ranking TOP3 only: 1.31 seconds
Next/Previous processing only: 5.47 seconds ← This is it!
Both disabled: 0.057 seconds
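
Instead of commenting code in and out, the same comparison could be made by timing each section directly. The sketch below assumes two stand-in functions, build_weekly_ranking and build_next_previous, because the site's real functions are not shown in this file; the sleeps merely simulate work.

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}


@contextmanager
def timed(label: str):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


# Hypothetical stand-ins for the two page sections under suspicion.
def build_weekly_ranking():
    time.sleep(0.01)


def build_next_previous():
    time.sleep(0.05)


with timed("weekly_ranking"):
    build_weekly_ranking()
with timed("next_previous"):
    build_next_previous()

# Slowest section first
for label, seconds in sorted(timings.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {seconds:.3f} s")
```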

The suspects were narrowed down.

"The 'Next/Previous processing' takes 5.5 seconds... what exactly is it doing?"


Chapter 7: The Code Speaks Truth

I scrutinized the code. And there, I saw the true culprit's identity.

def get_article_list(dir_name: str):
    """Retrieve list of published articles"""
    files = [f[:-3] for f in os.listdir(ARTICLES_DIR) if f.endswith('.md')]
    files.sort()

    published_files = []
    for filename in files:  # Loop 240 times
        meta, article = load_article(dir_name=dir_name, file_name=filename)
        # ↑ Loading and parsing entire Markdown files

        if not meta.get('published'):
            continue

        # Date checking...
        published_files.append(filename)

    return published_files

It was loading the complete Markdown file for all 240 articles, every single time.

Each article has a 33-minute reading time, which means very large files. All 240 of them were being loaded and parsed on every single request.

240 articles × ~25ms = ~6 seconds

"So this is the true culprit..."


Chapter 8: The Culprit's Motive

How did this happen?

When there were only a few articles (10 or 20), there was no problem. But every new article added another full file load to every request, so the delay grew in step with the blog.

Lack of scalability. Growth had become a shackle.
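
A back-of-the-envelope sketch of that growth, assuming the roughly 25 ms per article estimated in Chapter 7 stays constant (a simplification; real files vary in size):

```python
PER_ARTICLE_SECONDS = 0.025  # ~25 ms per article, the estimate from Chapter 7


def estimated_list_cost(article_count: int) -> float:
    """Estimated seconds spent per request when every article is fully parsed."""
    return article_count * PER_ARTICLE_SECONDS


for count in (10, 20, 240):
    print(f"{count:>3} articles -> ~{estimated_list_cost(count):.2f} s per request")
```

At 10 or 20 articles the cost is a fraction of a second and easy to overlook; at 240 it reaches about 6 seconds.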


Chapter 9: Path to Resolution

"All we need is metadata. There's no need to read the body content."

I devised an optimization plan.

Solution 1: Load Only Front Matter

import yaml  # PyYAML

def parse_front_matter_only(file_path: str) -> dict:
    """Fast loading of only YAML Front Matter"""
    # The file is still read from disk, but the Markdown body is never parsed
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()

        # Extract only Front Matter (section enclosed by ---)
        if content.startswith('---'):
            end = content.find('---', 3)
            if end != -1:
                front_matter_str = content[3:end]
                return yaml.safe_load(front_matter_str) or {}

    # No Front Matter found
    return {}
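
A possible refinement, offered only as a sketch: read the file line by line and stop at the closing delimiter, so the article body is never pulled into memory. The helper name read_front_matter_text is hypothetical, and it assumes each --- delimiter sits on a line of its own.

```python
def read_front_matter_text(file_path: str) -> str:
    """Return the raw text between the opening and closing '---' lines, or ''."""
    lines = []
    with open(file_path, 'r', encoding='utf-8') as f:
        if f.readline().strip() != '---':
            return ''  # no front matter block at the top of the file
        for line in f:
            if line.strip() == '---':
                return ''.join(lines)  # closing delimiter found; stop reading here
            lines.append(line)
    return ''  # opening delimiter without a closing one
```

The returned text could then be handed to yaml.safe_load, as in the function above.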

Solution 2: Implement Caching

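import os

# Global cache: dir_name -> list of published article names
# (lives for the lifetime of the server process)
_ARTICLE_LIST_CACHE = {}

def get_article_list(dir_name: str):
    # Return from cache if available
    if dir_name in _ARTICLE_LIST_CACHE:
        return _ARTICLE_LIST_CACHE[dir_name]

    # Read files only when cache is absent
    files = sorted(f[:-3] for f in os.listdir(ARTICLES_DIR) if f.endswith('.md'))

    published_files = []
    for filename in files:
        file_path = os.path.join(ARTICLES_DIR, filename + '.md')
        meta = parse_front_matter_only(file_path)  # Lightweight version

        if not meta.get('published'):
            continue

        # Date checking...
        published_files.append(filename)

    # Save to cache
    _ARTICLE_LIST_CACHE[dir_name] = published_files

    return published_files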
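
One open question with a process-wide cache is when it should be refreshed. The sketch below shows one possible approach, not the blog's actual code: rebuild the list only when the set of .md files or their modification times change. The names get_cached, _dir_signature, and the build_list callback are all hypothetical.

```python
import os

_CACHE: dict[str, tuple[tuple, list[str]]] = {}


def _dir_signature(articles_dir: str) -> tuple:
    """Cheap fingerprint: (name, mtime_ns) of every .md file, via stat calls only."""
    with os.scandir(articles_dir) as entries:
        return tuple(sorted(
            (entry.name, entry.stat().st_mtime_ns)
            for entry in entries
            if entry.name.endswith('.md')
        ))


def get_cached(articles_dir: str, build_list) -> list[str]:
    """Return the cached list, rebuilding it only when the directory changed."""
    signature = _dir_signature(articles_dir)
    cached = _CACHE.get(articles_dir)
    if cached is not None and cached[0] == signature:
        return cached[1]
    result = build_list(articles_dir)  # the expensive step, e.g. front matter parsing
    _CACHE[articles_dir] = (signature, result)
    return result
```

This trades one stat call per file on each request for never serving a stale list; whether that trade is worthwhile depends on how often articles are published.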

Chapter 10: Dramatic Improvement

After implementation, I measured again with the Network tab.

Before improvement: 5.76 seconds
After improvement: 0.098 seconds

Approximately 58x faster

The Lighthouse score also improved dramatically.

Before improvement LCP: 5.72 seconds
After improvement LCP: 1.2 seconds (estimated)

And most importantly—

OGP images now displayed instantly on all articles.


Epilogue: The Detective's Insight

"Client, the real reason OGP images weren't displaying was that the server response was too slow."

"If it takes 6 seconds, X's crawler will timeout. Even with correct OGP configuration, it's meaningless if the information can't be retrieved."

The client looked surprised.

"So, the OGP images were... victims?"

"Exactly. The true culprit was '240 full Markdown file loads.' This unnecessary processing was slowing everything down."


【Case Resolution Key Points】

🔍 Pursuing Root Cause with 5 Whys Analysis

  1. Why don't OGP images display? → Because X's crawler is timing out

  2. Why does it time out? → Because server response is too slow (5.76 seconds)

  3. Why is server response slow? → Because Next/Previous processing takes 5.47 seconds

  4. Why does that processing take so long? → Because it loads all 240 Markdown files every time

  5. Why is it necessary to load them every time? → Because there's no caching, and no distinction between necessary and unnecessary data

Root Cause: Repetition of unnecessary processing (240 full Markdown file loads)


【Applied Frameworks】

Performance Profiling
- Lighthouse: Visualizing overall page performance
- Network Tab: Identifying bottlenecks
- Timing API: Measuring detailed processing times

Root Cause Analysis
- Not surface symptoms (OGP images)
- But pursuing deeper structures (server processing)

Optimization Strategy
- Reducing unnecessary processing (Full Text → Front Matter Only)
- Utilizing cache (240 loads → 1 load)
- Ensuring scalability (fast even as articles increase)


【Implications for Data Analysis】

This case demonstrates important lessons for data analysis as well.

Lesson 1: Don't Be Misled by Surface Symptoms
- OGP image problem → Actually a server processing problem
- Sales decline → Actually a customer experience problem
- Low conversion rate → Actually a page speed problem

Lesson 2: Verify with Data
- Judge by measurement, not speculation
- "Tools" like Lighthouse and Network tab illuminate truth
- Same in data analysis: Verify with GA4, heatmaps, A/B tests

Lesson 3: Consider Scalability
- Processing that's fine with 10 articles breaks down at 240
- System that's fine with 100 users breaks down at 100,000
- Design with growth in mind is essential


【ROI Detective's Maxims】

"90% of problems lie not in surface symptoms, but in deeper structural causes."

"Reasoning without data is fantasy. Optimization without measurement is futile."

"I was investigating an OGP case and discovered a loading case. This is the essence of detective work."

"Behind small problems, great truths are hidden. A detective holds the key to open that door."


【ROI of Improvements】

Time Value:
- Time saved: 5.7 seconds per page × number of visitors
- Monthly 2,000 PV → ~3 hours of user time saved
- Yearly 24,000 PV → ~38 hours saved
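
The time-value arithmetic can be checked in a few lines. This sketch assumes every page view saves the full 5.7 seconds, which is the simplification the estimate relies on; the names are illustrative.

```python
SECONDS_SAVED_PER_VIEW = 5.7   # from the before/after measurements
MONTHLY_PAGE_VIEWS = 2_000     # traffic figure used in the estimate


def hours_saved(page_views: int) -> float:
    """Convert per-view seconds saved into total hours saved."""
    return page_views * SECONDS_SAVED_PER_VIEW / 3600


print(f"Monthly: {hours_saved(MONTHLY_PAGE_VIEWS):.1f} h")       # 3.2
print(f"Yearly:  {hours_saved(MONTHLY_PAGE_VIEWS * 12):.1f} h")  # 38.0
```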

Business Value:
- Reduced bounce rate: 20% decrease (estimated) from page speed improvement
- Increased SNS share rate: 30% CTR improvement (estimated) from OGP image display
- Improved SEO ranking: Page speed is a ranking factor
- Reduced server costs: Load reduced to 1/58 by eliminating unnecessary processing

Learning Value:
- Performance optimization practice
- Bottleneck identification techniques
- Cache strategy design
- Importance of scalability


"Building small worlds and connecting them. This time, from the small entrance of OGP images, we created great value: site-wide performance improvement."—From the ROI Detective Agency records


This case file became a monumental solution etched in the history of the ROI Detective Agency.
