404 Alert Triage

Overview

This document helps content and development teams triage elevated 404 response codes at the CloudFront layer.

This document provides:

  • Hypotheses for what the problem might be.
  • Steps to validate each hypothesis.
  • Mitigations to be considered while debugging.

Hypothesis

  • A deployment has changed routes in the application.
  • A high-traffic page has been unpublished.
  • A high-traffic file was removed from the server.
  • A user was removed from a Drupal application, along with its content.
  • A bot is generating the errors through scraping, phishing or other malicious activity.

Validation

Review Application Logs

Go to CloudWatch Logs Insights.

Using CloudWatch's query syntax, find any files, content, assets or endpoints returning 4xx response codes.
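As a starting point, the sketch below builds a Logs Insights query that counts responses for one status code, grouped by a field. The commands used (filter, stats, sort, limit) are part of the query syntax, but the field names `status`, `path` and `client_ip` are assumptions: use whichever fields Logs Insights discovers in your log group. The helper name `build_404_query` is hypothetical.

```python
def build_404_query(status=404, group_by="path", limit=20):
    """Return a Logs Insights query counting `status` responses per `group_by` value.

    The field names are assumptions; replace them with the fields
    that exist in your own log format.
    """
    return "\n".join([
        f"filter status = {status}",
        f"| stats count(*) as hits by {group_by}",
        "| sort hits desc",
        f"| limit {limit}",
    ])


if __name__ == "__main__":
    # Paste the printed query into the Logs Insights query editor.
    print(build_404_query())
    print()
    print(build_404_query(group_by="client_ip", limit=10))
```

Grouping by path first usually shows whether the errors concentrate on a few URLs or are spread across many, which helps to choose between the hypotheses above.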

With this information, you can identify patterns in the requests, such as common IP address ranges, paths, user agents or query strings.
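If you export the query results, a few lines of Python can surface these patterns. This is a minimal sketch that assumes each record is a dictionary with `status`, `path`, `client_ip` and `user_agent` keys. Those names and the sample data are hypothetical, so adapt them to your export.

```python
from collections import Counter

# Hypothetical records, shaped as if exported from a log query.
SAMPLE_RECORDS = [
    {"status": 404, "path": "/sites/default/files/report.pdf",
     "client_ip": "203.0.113.10", "user_agent": "Mozilla/5.0"},
    {"status": 404, "path": "/sites/default/files/report.pdf",
     "client_ip": "203.0.113.11", "user_agent": "Mozilla/5.0"},
    {"status": 404, "path": "/wp-login.php",
     "client_ip": "198.51.100.7", "user_agent": "ExampleBot/1.0"},
    {"status": 404, "path": "/sites/default/files/report.pdf",
     "client_ip": "198.51.100.7", "user_agent": "ExampleBot/1.0"},
    {"status": 200, "path": "/",
     "client_ip": "203.0.113.10", "user_agent": "Mozilla/5.0"},
]


def summarise_404s(records, top=3):
    """Count 404 responses by path, client IP and user agent."""
    not_found = [r for r in records if int(r["status"]) == 404]
    return {
        field: Counter(r[field] for r in not_found).most_common(top)
        for field in ("path", "client_ip", "user_agent")
    }


if __name__ == "__main__":
    for field, counts in summarise_404s(SAMPLE_RECORDS).items():
        print(field, counts)
```

A single path dominating the counts suggests missing content or a missing file, while a single IP address or user agent requesting many different paths looks more like bot traffic.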

Application-layer debugging

Go to the content administration interface in the application. Check that the content and/or assets exist, and check that the affected assets, content or files are publicly visible. You may find one of the following:

  • A managed file is missing from Drupal in one of its file systems.
  • A homepage, landing page or other content was unpublished from public view.

Finding more on recent releases

A recent release may have changed public-facing routes specific to application features. By checking which release most recently reached production, you can review its code changes and see whether any route changes are associated with the problematic endpoints.

  1. List your Skpr releases.
  2. Find the release associated with the production environment.
  3. Note the date and time the production release was created.
  4. Identify whether any of these 404 errors were logged before this release was created.
  5. Identify whether there are any recent releases you could fall back to if needed.
  6. Identify any changes made to the application that are applicable to your findings.
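The timing check in the steps above can be sketched as follows. The timestamps are hypothetical placeholders: copy the release creation time from your release list and the error times from your log query. The function name `split_by_release` is illustrative only.

```python
from datetime import datetime

# Hypothetical values; replace them with your own release and log data.
RELEASE_CREATED = datetime.fromisoformat("2024-05-01T10:00:00+00:00")
ERROR_TIMES = [
    datetime.fromisoformat("2024-05-01T09:55:00+00:00"),
    datetime.fromisoformat("2024-05-01T10:02:00+00:00"),
    datetime.fromisoformat("2024-05-01T10:03:30+00:00"),
]


def split_by_release(error_times, release_created):
    """Partition 404 timestamps into those logged before and after the release."""
    before = [t for t in error_times if t < release_created]
    after = [t for t in error_times if t >= release_created]
    return before, after


if __name__ == "__main__":
    before, after = split_by_release(ERROR_TIMES, RELEASE_CREATED)
    print(f"{len(before)} logged before the release, {len(after)} after")
```

If a large share of the errors predates the release, the release is less likely to be the cause and the other hypotheses deserve a closer look.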

Removed Drupal users

To identify any removed users, you may need to resort to backups or to records such as logs, issues and team communications.

If you are not able to find the content you're looking for, identify if a user was removed from the system.

It may help to compare your results to a non-production environment, or to a daily MySQL image backup taken before the logs indicate the issue started.
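One way to make that comparison concrete is a set difference over identifiers. The sketch below assumes you have already exported the user IDs from the backup and from production by whatever means you have available. The IDs shown and the function name `missing_from_production` are hypothetical.

```python
def missing_from_production(backup_ids, production_ids):
    """Return IDs present in the backup but absent from production, sorted."""
    return sorted(set(backup_ids) - set(production_ids))


if __name__ == "__main__":
    # Hypothetical exports; replace them with IDs from your own environments.
    backup_user_ids = [1, 2, 3, 5, 8]
    production_user_ids = [1, 2, 5]
    print(missing_from_production(backup_user_ids, production_user_ids))
```

The same comparison works for content or file identifiers if you suspect that content was removed along with a user.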

Mitigations

  • Can the Content Team make the content public if it is not public and should be?
  • Can the Content Team re-upload any files identified as missing?
  • Can the Development Team redeploy a previous version?
  • Can the Development Team roll back the application?
  • Can the Skpr Platform Team block the traffic using the Web Application Firewall?