SQL String Functions and Text Manipulation
Learn SQL string functions including CONCAT, SUBSTRING, TRIM, UPPER, LOWER, REPLACE, LENGTH, and pattern matching for text processing and data cleaning.
Query
Detailed Explanation
SQL String Functions and Text Manipulation
String functions are essential for data cleaning, formatting, searching, and transforming text data within SQL queries. Most functions are consistent across databases with minor syntax variations.
Concatenation
-- Standard SQL
SELECT first_name || ' ' || last_name AS full_name FROM employees;
-- MySQL
SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM employees;
-- With NULL handling (skips NULLs)
SELECT CONCAT_WS(' ', first_name, middle_name, last_name) AS full_name
FROM employees;
Case Conversion
SELECT
UPPER(email) AS email_upper,
LOWER(email) AS email_lower,
INITCAP(full_name) AS name_proper -- PostgreSQL
FROM users;
Substring and Position
SELECT
SUBSTRING(phone FROM 1 FOR 3) AS area_code, -- PostgreSQL
SUBSTR(phone, 1, 3) AS area_code, -- MySQL/Oracle
POSITION('@' IN email) AS at_position,
LEFT(name, 1) AS initial,
RIGHT(phone, 4) AS last_four
FROM contacts;
Trimming and Padding
SELECT
TRIM(' hello ') AS trimmed,
LTRIM(' hello') AS left_trimmed,
RTRIM('hello ') AS right_trimmed,
LPAD(id::TEXT, 6, '0') AS padded_id,
RPAD(name, 20, '.') AS formatted_name
FROM records;
Search and Replace
SELECT
REPLACE(phone, '-', '') AS digits_only,
LENGTH(name) AS name_length,
CHAR_LENGTH(description) AS desc_length
FROM contacts;
Pattern Matching
-- LIKE (all databases)
SELECT * FROM users WHERE email LIKE '%@gmail.com';
-- SIMILAR TO (PostgreSQL)
SELECT * FROM products WHERE sku SIMILAR TO '[A-Z]{3}-[0-9]{4}';
-- REGEXP (MySQL) / ~ (PostgreSQL)
SELECT * FROM logs WHERE message ~ 'ERROR|FATAL';
String Aggregation
-- PostgreSQL
SELECT department, STRING_AGG(name, ', ' ORDER BY name) AS members
FROM employees GROUP BY department;
-- MySQL
SELECT department, GROUP_CONCAT(name ORDER BY name SEPARATOR ', ')
FROM employees GROUP BY department;
Best Practices
- Use TRIM on user inputs to remove accidental whitespace
- Store emails in lowercase for consistent comparison
- Avoid using string functions on indexed columns in WHERE clauses
- Use database collation settings for locale-aware string comparisons
- For full-text search, use dedicated indexes (tsvector in PostgreSQL, FULLTEXT in MySQL) instead of LIKE with leading wildcards
String functions are the essential tools for data cleaning, formatting output, and building search functionality directly in SQL queries.
Use Case
A data cleaning pipeline standardizes customer records by trimming whitespace, normalizing email addresses to lowercase, and extracting domain names from email fields for analytics grouping.