Auto-Detect Column Boundaries from Text
Let the tool automatically discover column boundaries in fixed-width text by analyzing whitespace patterns across all rows.
Detailed Explanation
Automatic Column Detection
When you receive a fixed-width file without documentation, manually figuring out where each column starts and ends can be tedious. The Auto-detect feature analyzes your data and infers column boundaries automatically.
How the Algorithm Works
- Scan all lines and record the length of the longest line
- Count spaces at each character position across all lines
- Identify gaps: positions where 70% or more of lines have a space character
- Find transitions between gap regions and data regions
- Build columns using transition points as boundaries
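The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `detect_columns`, the `threshold` parameter, and the choice to treat positions past the end of a short line as spaces are all assumptions made for the example.

```python
def detect_columns(lines, threshold=0.7):
    """Infer (start, width) pairs from whitespace patterns (illustrative sketch)."""
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if not rows:
        return []

    # Step 1: length of the longest line.
    max_len = max(len(r) for r in rows)

    # Step 2: count spaces per position. Assumption: short lines are
    # padded with spaces so every line contributes to every position.
    space_counts = [0] * max_len
    for r in rows:
        for i, ch in enumerate(r.ljust(max_len)):
            if ch == " ":
                space_counts[i] += 1

    # Step 3: a position is a gap if enough lines have a space there.
    is_gap = [count / len(rows) >= threshold for count in space_counts]

    # Step 4: a column starts wherever a gap region gives way to data.
    starts = [
        i for i in range(max_len)
        if not is_gap[i] and (i == 0 or is_gap[i - 1])
    ]

    # Step 5: each column runs up to the start of the next one.
    columns = []
    for n, start in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else max_len
        columns.append((start, end - start))
    return columns


# Sample rows padded to widths 12, 15, and 11, followed by a 5-digit salary.
records = [
    ("Alice", "Engineering", "Senior", "85000"),
    ("Bob", "Marketing", "Junior", "62000"),
    ("Charlie", "Engineering", "Lead", "95000"),
    ("Diana", "Sales", "Manager", "78000"),
]
sample = [f"{n:<12}{d:<15}{l:<11}{s}" for n, d, l, s in records]
print(detect_columns(sample))  # [(0, 12), (12, 15), (27, 11), (38, 5)]
```

In this sketch each column's width includes its trailing padding, so a value that is longer than most others in its column (such as "Manager") still falls inside the detected range.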
Example
Input text:
Alice       Engineering    Senior     85000
Bob         Marketing      Junior     62000
Charlie     Engineering    Lead       95000
Diana       Sales          Manager    78000
Across all lines, the algorithm finds gap regions that end just before positions 12, 27, and 38, and it starts a new column at each of those positions. It generates:
| Column | Start | Width |
|---|---|---|
| Column1 | 0 | 12 |
| Column2 | 12 | 15 |
| Column3 | 27 | 11 |
| Column4 | 38 | 5 |
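Once the boundaries are known, each line can be cut into fields by slicing at the start and width values from the table. The sketch below assumes a hypothetical `parse_line` helper and a `COLUMNS` list copied from the table; it is one straightforward way to apply the result, not necessarily how the tool does it.

```python
# (name, start, width) triples taken from the table above.
COLUMNS = [
    ("Column1", 0, 12),
    ("Column2", 12, 15),
    ("Column3", 27, 11),
    ("Column4", 38, 5),
]


def parse_line(line, columns=COLUMNS):
    """Slice one fixed-width line into a dict of trimmed field values."""
    return {
        name: line[start:start + width].strip()
        for name, start, width in columns
    }


# A row padded to the widths in the table (12, 15, 11, then the salary).
row = f"{'Alice':<12}{'Engineering':<15}{'Senior':<11}85000"
print(parse_line(row))
```

Trimming with `strip()` removes the padding that fixed-width formats add; drop it if trailing spaces are significant in your data.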
When Auto-Detect Works Best
- Text where columns are clearly separated by runs of spaces
- Data with consistent column widths across all rows
- Files where no field values contain multiple consecutive spaces
When Auto-Detect Struggles
- Data with no spacing between adjacent columns (e.g., zero-padded numerics touching text)
- Columns containing empty values that create false space runs
- Data with irregular formatting or mixed-width lines
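Two of these failure modes are easy to reproduce with a small experiment. The sketch below re-implements only the gap test (70% threshold, short lines padded with spaces, both assumptions for illustration) and counts how many data regions it finds; the helper names `gap_mask` and `count_data_runs` and the sample rows are hypothetical.

```python
def gap_mask(rows, threshold=0.7):
    """Return one boolean per position: True where most lines have a space."""
    width = max(len(r) for r in rows)
    padded = [r.ljust(width) for r in rows]
    return [
        sum(p[i] == " " for p in padded) / len(rows) >= threshold
        for i in range(width)
    ]


def count_data_runs(mask):
    """Count gap-to-data transitions, i.e. the number of detected columns."""
    return sum(
        1 for i, gap in enumerate(mask)
        if not gap and (i == 0 or mask[i - 1])
    )


# Case 1: a zero-padded ID touches the name, so no gap separates them.
touching = ["00123Alice", "00045Bob", "00678Charlie"]

# Case 2: the middle column is empty in 3 of 4 rows, so it looks like a gap.
sparse = [
    f"{name:<8}{note:<8}{amount}"
    for name, note, amount in [
        ("Alice", "", "100"),
        ("Bob", "", "200"),
        ("Carol", "", "300"),
        ("Dave", "urgent", "400"),
    ]
]

print(count_data_runs(gap_mask(touching)))  # 1 column found, 2 expected
print(count_data_runs(gap_mask(sparse)))    # 2 columns found, 3 expected
```

In both cases the detected layout has fewer columns than the real one, which is the situation where manually adjusting the boundaries becomes necessary.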
After auto-detection, you can rename columns from the default Column1, Column2, etc. and adjust widths if the detection is slightly off.
Use Case
Quickly parsing undocumented fixed-width files received from external partners, legacy systems, or government data portals where the column specification is unknown or lost.