SQL String Functions and Text Manipulation

Learn SQL string functions including CONCAT, SUBSTRING, TRIM, UPPER, LOWER, REPLACE, LENGTH, and pattern matching for text processing and data cleaning.

Query

Detailed Explanation

SQL String Functions and Text Manipulation

String functions are essential for data cleaning, formatting, searching, and transforming text data within SQL queries. Most functions are consistent across databases with minor syntax variations.

Concatenation

-- Standard SQL
SELECT first_name || ' ' || last_name AS full_name FROM employees;

-- MySQL
SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM employees;

-- With NULL handling (skips NULLs)
SELECT CONCAT_WS(' ', first_name, middle_name, last_name) AS full_name
FROM employees;

Case Conversion

SELECT
  UPPER(email) AS email_upper,
  LOWER(email) AS email_lower,
  INITCAP(full_name) AS name_proper  -- PostgreSQL
FROM users;

Substring and Position

SELECT
  SUBSTRING(phone FROM 1 FOR 3) AS area_code,   -- PostgreSQL
  SUBSTR(phone, 1, 3) AS area_code,              -- MySQL/Oracle
  POSITION('@' IN email) AS at_position,
  LEFT(name, 1) AS initial,
  RIGHT(phone, 4) AS last_four
FROM contacts;

Trimming and Padding

SELECT
  TRIM('  hello  ') AS trimmed,
  LTRIM('  hello') AS left_trimmed,
  RTRIM('hello  ') AS right_trimmed,
  LPAD(id::TEXT, 6, '0') AS padded_id,
  RPAD(name, 20, '.') AS formatted_name
FROM records;

Search and Replace

SELECT
  REPLACE(phone, '-', '') AS digits_only,
  LENGTH(name) AS name_length,
  CHAR_LENGTH(description) AS desc_length
FROM contacts;

Pattern Matching

-- LIKE (all databases)
SELECT * FROM users WHERE email LIKE '%@gmail.com';

-- SIMILAR TO (PostgreSQL)
SELECT * FROM products WHERE sku SIMILAR TO '[A-Z]{3}-[0-9]{4}';

-- REGEXP (MySQL) / ~ (PostgreSQL)
SELECT * FROM logs WHERE message ~ 'ERROR|FATAL';

String Aggregation

-- PostgreSQL
SELECT department, STRING_AGG(name, ', ' ORDER BY name) AS members
FROM employees GROUP BY department;

-- MySQL
SELECT department, GROUP_CONCAT(name ORDER BY name SEPARATOR ', ')
FROM employees GROUP BY department;

Best Practices

  • Use TRIM on user inputs to remove accidental whitespace
  • Store emails in lowercase for consistent comparison
  • Avoid using string functions on indexed columns in WHERE clauses
  • Use database collation settings for locale-aware string comparisons
  • For full-text search, use dedicated indexes (tsvector in PostgreSQL, FULLTEXT in MySQL) instead of LIKE with leading wildcards

String functions are the essential tools for data cleaning, formatting output, and building search functionality directly in SQL queries.

Use Case

A data cleaning pipeline standardizes customer records by trimming whitespace, normalizing email addresses to lowercase, and extracting domain names from email fields for analytics grouping.

Try It — SQL Formatter

Open full tool