# Redacting Excel Spreadsheets: The 11 Places Sensitive Data Hides

> Deleting cells in Excel doesn't delete the data. Learn where PII hides in spreadsheets — hidden sheets, comments, named ranges, XML, and more — and how to actually remove it.

- **Author:** Neetusha
- **Published:** 2026-04-10
- **URL:** https://www.redactifyai.com/blog/redact-excel-spreadsheets/

---

In August 2023, the Police Service of Northern Ireland (PSNI) responded to a Freedom of Information request by publishing what should have been a routine spreadsheet. Instead, due to a handling error, the published Excel file contained a hidden tab with the surnames, first initials, ranks, work locations, and departments of all 9,483 serving PSNI officers and staff, including officers working in intelligence and covert operations. The data was available online for over two hours before being taken down. Dissident republican groups later claimed to have obtained the file. The UK Information Commissioner's Office fined the PSNI GBP 750,000 in October 2024, noting that simple-to-implement procedures would have prevented the breach.

The PSNI incident is the most consequential Excel redaction failure on record, but it's not unique. In 2019, the Blackpool Teaching Hospitals NHS Foundation Trust was fined GBP 185,000 after a staff member sent a spreadsheet containing identifiable patient data to 1,523 email recipients. The data was supposed to be anonymized. It wasn't. Hidden columns and uncleared sorting references retained the original identifiers.

These failures share a root cause: Excel files are not flat text documents. An XLSX file is a ZIP archive containing XML files, relationships, metadata, and embedded objects across multiple layers. Deleting the visible content of a cell doesn't necessarily delete the data from every layer where it exists. And Excel has more hiding places for sensitive data than any other common office document format.

## The 11 places sensitive data hides in Excel

### 1. Hidden sheets

Excel allows worksheets to be hidden at two levels. A sheet can be set to "Hidden," which removes it from the tab bar but leaves it recoverable by right-clicking any sheet tab and choosing Unhide (or Home > Format > Hide & Unhide > Unhide Sheet). Or it can be set to "Very Hidden," which removes it from the Unhide dialog entirely. A Very Hidden sheet can only be restored through the VBA editor (Alt+F11, then set the sheet's Visible property to xlSheetVisible in the Properties window) or by editing the workbook's XML.

This is what happened in the PSNI breach: the sensitive data sat on a hidden sheet that nobody reviewed before publication. Hidden sheets are a common vector for Excel data breaches because they're easy to create accidentally (a sheet gets hidden during development and forgotten) and never appear during normal navigation.

**How to check:** In Excel, right-click any sheet tab and select "Unhide." If the option is grayed out, the workbook has no ordinary hidden sheets, but it may still have Very Hidden ones. Open the VBA editor (Alt+F11), select each sheet under the workbook in the Project Explorer, and check its Visible property in the Properties window (F4).
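The same check can be scripted against the file itself, which also catches Very Hidden sheets. The sketch below reads the `state` attribute of each `sheet` element in xl/workbook.xml. The function name `sheet_visibility` is illustrative, and the code assumes an unencrypted workbook saved in the standard (Transitional) XLSX format.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by standard (Transitional) XLSX files; Strict files differ.
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def sheet_visibility(xlsx):
    """Return {sheet name: state} read from xl/workbook.xml.

    `xlsx` is a path or a binary file object. A missing `state`
    attribute means the sheet is visible.
    """
    with zipfile.ZipFile(xlsx) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return {
        sheet.get("name"): sheet.get("state", "visible")
        for sheet in root.iter(MAIN_NS + "sheet")
    }
```

Any sheet reported as `hidden` or `veryHidden` should be opened and reviewed before the file leaves the organization.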

### 2. Hidden rows and columns

Rows and columns can be hidden by selecting them and choosing "Hide." The data remains in the file — it's just not displayed. A spreadsheet that appears to have five columns of sanitized data may actually have fifteen columns, with the hidden ten containing SSNs, account numbers, and other PII.

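**How to check:** Select all cells (Ctrl+A), then go to Home > Format > Hide & Unhide and choose Unhide Rows, then Unhide Columns. Gaps in row numbers or column letters indicate hidden content. Also clear any active filters, since filtered-out rows are hidden the same way.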
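For files with many sheets, hidden rows and columns can also be listed from the worksheet XML, where they carry a `hidden` attribute on `row` and `col` elements. This is a minimal sketch with a hypothetical function name; it assumes the standard XLSX namespace and does not flag rows or columns that are merely set to zero height or width.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"
TRUE_VALUES = {"1", "true"}


def hidden_rows_and_columns(xlsx):
    """Map each worksheet part to its hidden row numbers and column spans."""
    findings = {}
    with zipfile.ZipFile(xlsx) as archive:
        for part in archive.namelist():
            if not re.fullmatch(r"xl/worksheets/[^/]+\.xml", part):
                continue
            root = ET.fromstring(archive.read(part))
            # Row numbers are 1-based; 0 marks a row element without an index.
            rows = [int(row.get("r", 0)) for row in root.iter(MAIN_NS + "row")
                    if row.get("hidden") in TRUE_VALUES]
            # Each col element covers an inclusive span of column indexes.
            cols = [(int(col.get("min")), int(col.get("max")))
                    for col in root.iter(MAIN_NS + "col")
                    if col.get("hidden") in TRUE_VALUES]
            if rows or cols:
                findings[part] = {"rows": rows, "columns": cols}
    return findings
```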

### 3. Cell comments and notes

Excel supports both comments (threaded discussions in Microsoft 365) and notes (the older yellow sticky-note style annotations). These can contain text that doesn't appear in the cell's visible value. A reviewer might annotate a cell with "Check this SSN — doesn't match the W-9" or "Client's real income is $X, adjusted for the application." These annotations survive most copy-paste operations and are embedded in the file's XML.

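**How to check:** Go to Review > Show Comments for threaded comments and Review > Notes > Show All Notes for notes. Or search the file's XML: rename .xlsx to .zip and inspect the xl/comments*.xml files (notes) and the xl/threadedComments/ directory (threaded comments).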
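Pulling every annotation out of the archive in one pass makes it easier to review them for PII. The following sketch, with the illustrative name `comment_texts`, matches elements by local name so that it covers both the notes parts and the threaded comment parts. It collects comment text only, not author names.

```python
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def comment_texts(xlsx):
    """Return (part, cell reference, text) for every note and threaded comment."""
    results = []
    with zipfile.ZipFile(xlsx) as archive:
        for part in archive.namelist():
            if not part.endswith(".xml"):
                continue
            is_note = part.startswith("xl/comments")
            is_thread = part.startswith("xl/threadedComments/")
            if not (is_note or is_thread):
                continue
            root = ET.fromstring(archive.read(part))
            for element in root.iter():
                if local_name(element.tag) in ("comment", "threadedComment"):
                    # itertext() joins rich-text runs into one string.
                    text = "".join(element.itertext()).strip()
                    results.append((part, element.get("ref"), text))
    return results
```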

### 4. Named ranges

Named ranges assign a human-readable name to a cell or range of cells. They're commonly used in formulas and data validation. A name can outlive the cleanup of the cells it points to: the definition may still reference a range that was "cleared," revealing sheet names and structure, or the range may still hold data that is only masked by formatting (white text on a white background, for example). Names can also store constant values directly, and names flagged as hidden do not appear in Name Manager.

**How to check:** Go to Formulas > Name Manager to see the visible defined names and the ranges they reference. Hidden names have to be listed through VBA or by reading the `definedNames` element in xl/workbook.xml.
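Reading the definitions straight from xl/workbook.xml lists every defined name, including any marked hidden. This is a small sketch under the assumption of a standard XLSX namespace; `defined_names` is a hypothetical helper name.

```python
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def defined_names(xlsx):
    """Return (name, refers_to, is_hidden) for each defined name in the workbook."""
    with zipfile.ZipFile(xlsx) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [
        (dn.get("name"), dn.text or "", dn.get("hidden") in ("1", "true"))
        for dn in root.iter(MAIN_NS + "definedName")
    ]
```

Definitions that point at hidden sheets, contain `#REF!`, or hold literal values deserve a closer look.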

### 5. Cell formatting that obscures data

A cell can contain a value that's invisible in the display. Common techniques include:

- White text on a white background
- Custom number format ";;;" which displays nothing regardless of the cell's value
- Very small font sizes (1pt or 2pt) that appear blank on screen
- Conditional formatting rules that hide certain values

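In each case, the data is fully present in the file and will appear if the cell is selected (the formula bar shows the real value), copied, or reformatted. Exporting to CSV exposes white-on-white and tiny-font values, but CSV export writes the displayed text, so a cell masked with ";;;" exports as empty even though its value is still in the XLSX.

**How to check:** Select all cells, set the font color to black, reset the font size, set the number format to General, and clear conditional formatting rules. Exporting to CSV and comparing is a useful second check, but it will not reveal values masked by a custom number format.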
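Number-format masking can be detected without opening Excel by cross-referencing each cell's style index with xl/styles.xml. The sketch below covers only the ";;;" case (not white text or tiny fonts), uses a hypothetical function name, and ignores inline strings, which is an acceptable simplification for files saved by Excel.

```python
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def cells_with_blank_format(xlsx, blank_format=";;;"):
    """Return (worksheet part, cell ref) for valued cells whose format shows nothing."""
    hits = []
    with zipfile.ZipFile(xlsx) as archive:
        styles = ET.fromstring(archive.read("xl/styles.xml"))
        # Custom number formats that render every value as empty.
        blank_ids = {fmt.get("numFmtId") for fmt in styles.iter(MAIN_NS + "numFmt")
                     if fmt.get("formatCode") == blank_format}
        # A cell's `s` attribute is an index into the cellXfs list.
        cell_xfs = styles.find(MAIN_NS + "cellXfs")
        xfs = list(cell_xfs) if cell_xfs is not None else []
        blank_styles = {str(i) for i, xf in enumerate(xfs)
                        if xf.get("numFmtId") in blank_ids}
        for part in archive.namelist():
            if not (part.startswith("xl/worksheets/") and part.endswith(".xml")):
                continue
            sheet = ET.fromstring(archive.read(part))
            for cell in sheet.iter(MAIN_NS + "c"):
                has_value = cell.find(MAIN_NS + "v") is not None
                if has_value and cell.get("s") in blank_styles:
                    hits.append((part, cell.get("r")))
    return hits
```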

### 6. Formulas referencing external data

Formulas can reference other workbooks, external data connections, or named ranges in other files. A formula like `='C:\Users\jsmith\Documents\[Client_SSNs.xlsx]Sheet1'!A1` not only reveals the file path (which includes a username) but also indicates the existence and location of a file containing SSNs. External references can also pull live data from sources that the recipient shouldn't have access to.

**How to check:** Go to Data > Edit Links (labeled Workbook Links in newer Microsoft 365 builds) to see all external workbook references. Use Ctrl+` (grave accent) to toggle formula view and look for external references.
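External workbook paths are stored in relationship files under xl/externalLinks/_rels/, so they can be listed directly. A minimal sketch, assuming the standard package relationships namespace; the function name is illustrative.

```python
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_link_targets(xlsx):
    """Return the target path or URL of every external workbook link."""
    targets = []
    with zipfile.ZipFile(xlsx) as archive:
        for part in archive.namelist():
            if not (part.startswith("xl/externalLinks/_rels/") and part.endswith(".rels")):
                continue
            root = ET.fromstring(archive.read(part))
            for rel in root.iter(RELS_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    targets.append(rel.get("Target"))
    return targets
```

Each returned target is a file path or URL that the recipient would be able to read, usernames and folder names included.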

### 7. Embedded objects and OLE links

Excel files can contain embedded images, PDFs, other Office documents, and OLE (Object Linking and Embedding) objects. These embedded objects exist as separate data streams within the XLSX archive. An embedded PDF might contain unredacted content even if the spreadsheet's visible cells have been sanitized. An embedded image might contain [EXIF metadata with GPS coordinates](/blog/how-to-redact-pdf-complete-guide).

**How to check:** Look for object indicators (small icons or frames) in the spreadsheet. In the file's XML structure, check the xl/embeddings/ directory for embedded documents and OLE objects, and the xl/media/ directory for images.
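Listing the archive members under those directories gives a quick inventory of what is embedded. The sketch below only enumerates the parts and their sizes; inspecting the contents of each embedded file is a separate step. The helper name is hypothetical.

```python
import zipfile


def embedded_parts(xlsx):
    """Return {part name: size in bytes} for embedded objects and images."""
    prefixes = ("xl/embeddings/", "xl/media/")
    with zipfile.ZipFile(xlsx) as archive:
        return {
            info.filename: info.file_size
            for info in archive.infolist()
            if info.filename.startswith(prefixes) and not info.is_dir()
        }
```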

### 8. Document properties and metadata

Every Excel file contains metadata in its properties: author name, last modified by, company name, creation date, last saved date, total editing time. This metadata is accessible through File > Info > Properties and persists through most file operations. The "author" field typically auto-populates with the Windows username or Microsoft 365 account name of the person who created the file.

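**How to check:** File > Info > Show All Properties, or Properties > Advanced Properties for the full dialog. For thorough inspection, examine the docProps/core.xml and docProps/app.xml files in the XLSX archive, plus docProps/custom.xml if custom properties were defined.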
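The property files are small XML documents, so dumping their top-level fields takes only a few lines. This sketch uses an illustrative function name, reads only core.xml and app.xml, and skips nested entries such as the list of sheet titles.

```python
import zipfile
import xml.etree.ElementTree as ET


def document_properties(xlsx):
    """Return {property name: value} from docProps/core.xml and docProps/app.xml."""
    props = {}
    with zipfile.ZipFile(xlsx) as archive:
        present = set(archive.namelist())
        for part in ("docProps/core.xml", "docProps/app.xml"):
            if part not in present:
                continue
            root = ET.fromstring(archive.read(part))
            for element in root:
                text = (element.text or "").strip()
                if text:
                    # Drop the '{namespace}' prefix, e.g. 'creator', 'Company'.
                    props[element.tag.rsplit("}", 1)[-1]] = text
    return props
```

Fields such as `creator`, `lastModifiedBy`, and `Company` are the ones most likely to identify a person or organization.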

### 9. PivotTable source data

PivotTables can cache their entire source dataset within the workbook. Even if you delete the original data sheet and only keep the PivotTable, the cached data — including every row and column from the original dataset — remains in the file. Double-clicking a PivotTable value cell can regenerate the underlying records, exposing data that appears to have been removed.

**How to check:** Right-click any PivotTable, select PivotTable Options, and check whether "Save source data with file" is enabled. For a thorough check, examine the xl/pivotCache/ directory in the XLSX archive.
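To see what a cache actually holds, the parts under xl/pivotCache/ can be read directly. In the sketch below, the cached rows are counted from the records parts and the distinct string values are collected from `s` elements, which in Excel-generated files mostly sit in the cache definition. The function name is hypothetical and numeric or date values are not collected.

```python
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def pivot_cache_contents(xlsx):
    """Return ({records part: cached row count}, set of cached string values)."""
    record_counts, string_values = {}, set()
    with zipfile.ZipFile(xlsx) as archive:
        for part in archive.namelist():
            if not (part.startswith("xl/pivotCache/") and part.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(part))
            if root.tag == MAIN_NS + "pivotCacheRecords":
                # Each direct <r> child is one row of the original source data.
                record_counts[part] = len(root.findall(MAIN_NS + "r"))
            for item in root.iter(MAIN_NS + "s"):
                if item.get("v") is not None:
                    string_values.add(item.get("v"))
    return record_counts, string_values
```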

### 10. Data validation lists and dropdowns

Data validation rules can include custom lists or references to ranges containing sensitive information. A dropdown menu in a "Status" column might be backed by a validation list stored in a hidden sheet or a named range containing values like account numbers or employee IDs that were used during the spreadsheet's development.

**How to check:** Select cells with dropdown indicators, go to Data > Data Validation, and inspect the Source field.
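Checking dropdowns one cell at a time is slow, so the validation sources can also be extracted from the worksheet XML. This is a rough sketch with an illustrative function name: it reads standard `dataValidation` elements of type `list` and does not cover validations that some Excel versions store in extension lists.

```python
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def list_validation_sources(xlsx):
    """Return (worksheet part, cell range, source) for each list validation."""
    results = []
    with zipfile.ZipFile(xlsx) as archive:
        for part in archive.namelist():
            if not (part.startswith("xl/worksheets/") and part.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(part))
            for rule in root.iter(MAIN_NS + "dataValidation"):
                if rule.get("type") == "list":
                    # formula1 holds an inline list or a range/name reference.
                    source = rule.findtext(MAIN_NS + "formula1", default="")
                    results.append((part, rule.get("sqref"), source))
    return results
```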

### 11. The XLSX XML layer

This is the most overlooked vector. An XLSX file is a ZIP archive. Renaming the file from .xlsx to .zip and extracting it reveals the raw XML files that contain all of the spreadsheet's data. These XML files can contain:

- Deleted data that remains in the XML as orphaned elements
- Personal information in xl/sharedStrings.xml (which stores the workbook's unique string values, and can include values from deleted cells that haven't been purged)
- Connection strings in xl/connections.xml that may include server names, database names, and credentials
- Custom XML parts that applications have embedded in the file (stored under customXml/)

Excel's "Document Inspector" (File > Info > Check for Issues > Inspect Document) catches some of these, but not all. It won't detect Very Hidden sheets in all versions, doesn't inspect PivotTable caches thoroughly, and doesn't examine the XML layer for orphaned data.
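One way to look for orphaned strings is to compare the shared string table against the indexes that worksheet cells actually reference. The sketch below does that with the standard library only. The function name is hypothetical, and an empty result does not prove the file is clean, since strings can also live in caches, comments, and other parts.

```python
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def unreferenced_shared_strings(xlsx):
    """Return shared strings that no worksheet cell points to."""
    with zipfile.ZipFile(xlsx) as archive:
        names = archive.namelist()
        if "xl/sharedStrings.xml" not in names:
            return []
        table = ET.fromstring(archive.read("xl/sharedStrings.xml"))
        # Each <si> is one string; rich text is split across several <t> runs.
        strings = ["".join(t.text or "" for t in si.iter(MAIN_NS + "t"))
                   for si in table.findall(MAIN_NS + "si")]
        used = set()
        for part in names:
            if not (part.startswith("xl/worksheets/") and part.endswith(".xml")):
                continue
            sheet = ET.fromstring(archive.read(part))
            for cell in sheet.iter(MAIN_NS + "c"):
                index = cell.findtext(MAIN_NS + "v")
                # t="s" means the cell value is an index into the string table.
                if cell.get("t") == "s" and index is not None:
                    used.add(int(index))
    return [text for i, text in enumerate(strings) if i not in used]
```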

## Why Excel's Document Inspector isn't enough

Microsoft's built-in Document Inspector is better than nothing, but it has documented limitations.

It checks for: comments, document properties, custom XML data, headers and footers, hidden rows/columns/sheets, invisible content, and external data connections.

It does not reliably check for: Very Hidden sheets in all configurations, PivotTable source data caches, orphaned data in the XML string table, formatting-based data hiding (white-on-white text, ";;;" formats), formulas that reveal sensitive file paths, embedded objects' internal content, and data validation source references.

The Inspector also operates as a blunt instrument — it offers to remove entire categories of content rather than targeting specific sensitive values. If you want to remove SSNs from comments while keeping the comments themselves, the Inspector can't do that. It will delete all comments or none.

**Excel Document Inspector vs comprehensive redaction**

| Data location | Document Inspector | Dedicated redaction tool |
| --- | --- | --- |
| Hidden sheets | Detects Hidden; may miss Very Hidden in some versions | Scans all sheet visibility levels |
| Cell comments | Removes all comments (no selective redaction) | Detects and redacts PII within comments selectively |
| Document metadata | Removes all properties | Removes all properties and inspects XML layer |
| PivotTable caches | Limited detection | Extracts and scans cached source data |
| Formatting-hidden data | Does not detect | Reads cell values regardless of display formatting |
| XML orphaned strings | Does not detect | Parses XML string table for residual data |

## Real-world consequences of failed Excel redaction

The financial and operational impact of Excel redaction failures is well-documented.

**PSNI (August 2023): GBP 750,000 fine.** 9,483 officers and staff exposed through a hidden sheet in a FOI response. Officers in intelligence and covert roles had to be assessed for safety. The PSNI estimated the total cost — including security reassessments, counseling services, and operational changes — at tens of millions of pounds beyond the fine itself.

**Blackpool Teaching Hospitals NHS (2019): GBP 185,000 fine.** Identifiable patient data sent to 1,523 email recipients in a spreadsheet that should have been anonymized. Hidden columns retained the original identifiers that the anonymization process had missed.

**AIG subsidiary data exposure (2023).** An insurance company subsidiary inadvertently published policyholder data including policy numbers and personal details in a spreadsheet that was shared with business partners. Hidden rows containing the full dataset were not removed before distribution.

These incidents share a pattern: someone attempted to sanitize a spreadsheet by removing or hiding visible data, unaware that the file contained additional data layers that survived the sanitization.

## A proper Excel redaction process

A secure redaction workflow for Excel files addresses all eleven data vectors.

**Step 1: Inventory all content layers.** Before redacting anything, catalog what the file contains:
- Unhide all sheets, rows, and columns
- Show all comments and notes
- Check Name Manager for all defined names
- Toggle formula view to identify external references
- Check for embedded objects
- Review document properties
- Inspect PivotTable cache settings
- Check data validation sources

**Step 2: Identify sensitive data across all layers.** Scan cell values, comments, formulas, metadata, and embedded content for PII patterns. This means checking not just the visible worksheet content but the sharedStrings.xml string table, the PivotTable cache XML, and any embedded documents.

**Step 3: Redact at the data level.** Replace sensitive values with redaction markers ("[SSN]", "[ACCOUNT]") or remove them entirely. Clear the values from cells rather than just hiding rows or columns. Delete comments containing PII or edit them to remove the sensitive content. Break and remove external references that reveal sensitive file paths.

**Step 4: Strip metadata.** Remove document properties, author information, and other metadata. Use Document Inspector as a first pass for the categories it handles well, then manually verify the XML layer.

**Step 5: Rebuild and verify.** Save the file as a new XLSX (Save As, not Save) to force Excel to rebuild the file structure. Then verify the redaction by extracting the new XLSX as a ZIP and searching the XML files for any sensitive data that survived the process. A text search across all XML files in the archive catches data that visual inspection misses.
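The final text search in Step 5 can be automated by scanning every part of the rebuilt archive for known patterns. The sketch below uses a US SSN-style regular expression purely as an example pattern; the function name and the pattern are assumptions, and a raw byte search will miss values split across rich-text runs or stored inside embedded files that are themselves compressed.

```python
import re
import zipfile

# Example pattern only: digits formatted like a US Social Security number.
SSN_PATTERN = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")


def scan_archive(xlsx, pattern=SSN_PATTERN):
    """Return {part name: sorted unique matches} across all parts of the archive."""
    findings = {}
    with zipfile.ZipFile(xlsx) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            matches = {match.decode("ascii", "replace")
                       for match in pattern.findall(archive.read(info))}
            if matches:
                findings[info.filename] = sorted(matches)
    return findings
```

An empty dictionary from the redacted file, compared with a non-empty one from the original, is a reasonable sign that the rebuild removed the values the pattern covers.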

For organizations handling sensitive spreadsheets regularly, this five-step process is too labor-intensive to perform manually on every file. Automated detection and redaction — where a tool scans all eleven data layers and removes identified PII in a single pass — is the practical approach for volume operations.

## How RedactifyAI handles spreadsheet redaction

Excel support in [RedactifyAI](/) is planned rather than available today (see the note at the end of this section). The design goal is to read the complete file structure, not just the visible cell content: the detection pipeline would scan cell values, comments, metadata, and the underlying XML for PII patterns including SSNs, names, email addresses, phone numbers, and financial identifiers, with hidden sheets, hidden rows and columns, and formatting-masked data included in the scan automatically.

Redaction replaces identified PII with category labels that preserve the spreadsheet's structure while removing the sensitive data. A financial model with client SSNs becomes a financial model with "[SSN]" placeholders: formulas that don't depend on the redacted values still work, the layout is intact, and the personal identifiers are gone.

For organizations responding to FOIA requests, producing litigation discovery, or sharing financial data with partners, the aim is for one automated pass to replace the multi-step manual process that failed for the PSNI, Blackpool NHS, and other organizations that trusted visual inspection to catch everything.

RedactifyAI currently handles [PDF](/blog/how-to-redact-pdf-complete-guide), [Word document](/blog/how-to-redact-in-word), and image redaction with AI-powered PII detection. Excel spreadsheet support is on our roadmap. In the meantime, for spreadsheets containing sensitive data, export to PDF first and [run it through our free redaction tool](/tools/redact-pdf-free/) — or [sign up free](https://app.redactifyai.com/auth/signup) for full multi-format redaction across your team's documents.

## Frequently asked questions

### Does deleting a cell in Excel actually remove the data?

Not always. Clearing a cell's visible content removes the displayed value, but the data may persist in several places: the sharedStrings.xml string table (which stores unique strings and may not be purged when cells are cleared), PivotTable caches that snapshot the data, and any comments or notes attached to those cells. Named ranges that reference the cleared cells also survive and can reveal what the range used to hold. Saving the file as a new XLSX with "Save As" forces a file rebuild that removes some orphaned data, but a thorough cleanup requires inspecting the XML layer.

### What is a "Very Hidden" sheet in Excel?

A Very Hidden sheet is a worksheet whose Visible property is set to xlSheetVeryHidden (value 2) through VBA or the XML. Unlike a regular hidden sheet, a Very Hidden sheet does not appear in the Unhide dialog (right-click sheet tab > Unhide). It can only be accessed through the VBA editor or by editing the workbook's XML directly. This makes it easy to overlook during manual redaction. The PSNI breach involved data on a hidden sheet that was not checked before publication.

### Can PivotTables expose data from deleted sheets?

Yes. When a PivotTable is created with "Save source data with file" enabled (which is the default), the complete source dataset is cached within the workbook's pivotCache. If you delete the original data sheet but keep the PivotTable, the cached data remains in the file. Double-clicking a PivotTable summary cell generates a new sheet with the underlying detail records from the cache — effectively restoring data from the "deleted" sheet.

### Is Excel's Document Inspector sufficient for compliance?

For basic metadata removal, Document Inspector is a useful first step. For compliance-grade redaction where personal data must be verifiably removed ([GDPR right to erasure](/blog/redact-documents-gdpr-hipaa-compliance), [HIPAA](https://www.hhs.gov/hipaa/index.html) de-identification, [CCPA](https://oag.ca.gov/privacy/ccpa) deletion requests), Document Inspector has significant gaps. It doesn't detect data hidden through formatting, doesn't thoroughly scan PivotTable caches, doesn't examine the XML string table for orphaned data, and can't selectively redact PII from comments or cell values. Compliance-grade redaction requires either a thorough manual process or a dedicated redaction tool that addresses all data layers.

### How did the PSNI data breach happen?

In August 2023, the PSNI responded to a Freedom of Information request about staff numbers by publishing a spreadsheet. The published file contained a hidden worksheet with the surnames, first initials, ranks, locations, and departments of all 9,483 serving officers and staff. The data was downloadable for over two hours. The UK Information Commissioner's Office attributed the breach to inadequate data handling procedures and fined the PSNI GBP 750,000. The total cost including security reassessments for officers in sensitive roles was estimated at tens of millions of pounds.