Python · SQL · Web Dev · Java · AI/ML tracks launching soon — your one platform for all of IT
Intermediate

String Functions

Every text manipulation tool — concatenation, case, length, trimming, substrings, replacement, padding, splitting, and pattern matching for real data cleaning

14–18 min April 2026
Section 9 · SQL Functions
SQL Functions · 4 modulesModule 41

// Part 01

Why String Functions Matter in Real Data

In an ideal world, every string column is clean, consistently formatted, and ready to query. In production, data arrives from mobile apps, web forms, CSV imports, partner APIs, and legacy systems — each with its own formatting quirks. Customer names with extra spaces, email addresses in mixed case, phone numbers formatted a dozen different ways, product codes with inconsistent separators.

String functions are the tools that standardise, clean, extract, and transform text data at query time — without modifying the underlying rows. They are essential for data quality checks, display formatting, join key normalisation (so 'bangalore' matches 'Seattle'), and extracting structured information from unstructured text fields.

// Part 02

Concatenation — Joining Strings Together

The || operator and CONCAT()

Concatenation syntax — two approaches
-- PostgreSQL and DuckDB: || operator (SQL standard)
'Hello' || ' ' || 'World'     -- result: 'Hello World'

-- MySQL and most databases: CONCAT() function
CONCAT('Hello', ' ', 'World') -- result: 'Hello World'

-- Both work in DuckDB (this playground)
-- NULL propagation: NULL || 'anything' = NULL
-- CONCAT handles NULLs differently in MySQL (treats as empty string)

-- Safe concatenation with COALESCE:
COALESCE(first_name, '') || ' ' || COALESCE(last_name, '')
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 03

Case Functions — UPPER, LOWER, INITCAP

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
🎯 Pro Tip
Always normalise case before comparing or joining on string columns. WHERE LOWER(city) = 'bangalore' correctly matches 'Seattle', 'BANGALORE', and 'bangalore'. Without normalisation, a single inconsistency in the data breaks the match silently — no error, just missing rows.

// Part 04

Length Functions — Measuring Strings

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 05

Trimming — Removing Whitespace and Characters

Leading and trailing whitespace is one of the most common data quality issues — especially in data imported from CSV files or form submissions. TRIM removes it cleanly. Untrimmed strings cause join mismatches and incorrect comparisons silently.

TRIM variants
-- TRIM: removes leading AND trailing whitespace (default)
TRIM('  hello world  ')           -- 'hello world'

-- LTRIM: removes leading whitespace only
LTRIM('  hello world  ')          -- 'hello world  '

-- RTRIM: removes trailing whitespace only
RTRIM('  hello world  ')          -- '  hello world'

-- TRIM with specific character (remove specific chars not just spaces)
TRIM('.' FROM '...hello...')      -- 'hello'
TRIM(BOTH '0' FROM '00042000')    -- '42'   (removes leading/trailing zeros)
TRIM(LEADING '0' FROM '00042000') -- '42000' (removes only leading zeros)

-- Practical: normalise before joining
WHERE LOWER(TRIM(city)) = 'bangalore'
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 06

Substring Functions — Extracting Parts of Strings

SUBSTRING / SUBSTRportion of string
SUBSTRING(str, start_pos, length) -- 1-indexed

Extract a portion starting at start_pos for length characters. Negative start not supported in standard SQL.

LEFTfirst N characters
LEFT(str, n)

Returns the first n characters. LEFT('Seattle', 3) = 'Ban'

RIGHTlast N characters
RIGHT(str, n)

Returns the last n characters. RIGHT('ST001', 3) = '001'

SPLIT_PARTNth segment after splitting
SPLIT_PART(str, delimiter, n) -- 1-indexed

Splits by delimiter and returns the Nth segment. SPLIT_PART('a,b,c', ',', 2) = 'b'

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 07

String Search — POSITION, STRPOS, CONTAINS

Finding positions within strings
-- POSITION: find where a substring first appears (1-indexed, 0 = not found)
POSITION('@' IN 'user@gmail.com')          -- 5
POSITION('xyz' IN 'user@gmail.com')        -- 0 (not found)

-- STRPOS (PostgreSQL/DuckDB): same as POSITION, different syntax
STRPOS('user@gmail.com', '@')              -- 5

-- CONTAINS (DuckDB): returns TRUE/FALSE
CONTAINS('Seattle India', 'India')       -- true

-- Practical: extract everything before a delimiter
SUBSTRING(email, 1, POSITION('@' IN email) - 1)  -- username before @
SUBSTRING(email, POSITION('@' IN email) + 1)      -- domain after @
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 08

REPLACE and TRANSLATE — Substituting Characters

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
TRANSLATE — character-by-character substitution
-- TRANSLATE: replace each character in the FROM set with the corresponding
-- character in the TO set — one-to-one character mapping
TRANSLATE('Hello World', 'aeiou', '12345')  -- 'H2ll4 W4rld'
-- H→H, e→2, l→l, l→l, o→4, ' '→' ', W→W, o→4, r→r, l→l, d→d

-- Remove specific characters (TO is shorter than FROM — excess chars are deleted)
TRANSLATE('ST-001/A', '-/', '')             -- 'ST001A' (removes - and /)

-- Practical: clean phone numbers (remove all non-digit characters)
TRANSLATE(phone, '()-+ ', '')              -- removes (), -, +, space
-- Much cleaner than chaining multiple REPLACE calls
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 09

Padding — LPAD and RPAD

Padding adds characters to reach a target length — essential for generating fixed-width codes, formatting report columns for alignment, and generating zero-padded IDs.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 10

LIKE and Pattern Matching — Flexible String Filtering

LIKE with wildcards is the standard SQL pattern matching tool. % matches any sequence of characters (including none). _ matches exactly one character. For more powerful patterns, SIMILAR TO (PostgreSQL) and regular expressions provide regex capabilities.

LIKE wildcards
-- % = zero or more characters
-- _ = exactly one character

'Amul%'        -- starts with 'Amul'
'%Butter'      -- ends with 'Butter'
'%Milk%'       -- contains 'Milk' anywhere
'_mail.com'    -- any single char then 'mail.com'
'ST__1'        -- ST, any 2 chars, then 1

-- Case sensitivity: LIKE is case-sensitive in PostgreSQL
-- Use ILIKE for case-insensitive matching (PostgreSQL/DuckDB)
WHERE product_name ILIKE '%amul%'      -- matches 'Amul', 'AMUL', 'amul'
WHERE product_name LIKE '%Amul%'       -- matches only 'Amul' (case-sensitive)

-- Escape literal % or _: use ESCAPE clause
WHERE notes LIKE '%50%%' ESCAPE ''   -- contains literal '50%'
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 11

Regular Expressions — Powerful Pattern Matching

Regular expressions (regex) provide full pattern matching power — validating formats, extracting structured data from free text, and finding complex patterns. PostgreSQL uses ~ for regex match and ~* for case-insensitive match. DuckDB uses REGEXP_MATCHES.

Regex operators and functions
-- PostgreSQL / DuckDB regex:
-- ~   : matches regex (case-sensitive)
-- ~*  : matches regex (case-insensitive)
-- !~  : does NOT match
-- !~* : does NOT match (case-insensitive)

-- Validate email format (simplified)
WHERE email ~ '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}$'

-- Validate Indian mobile number: starts with 6-9, 10 digits total
WHERE phone ~ '^[6-9][0-9]{9}$'

-- Validate 6-digit zip_code
WHERE zip_code ~ '^[0-9]{6}$'

-- Extract with REGEXP_EXTRACT (DuckDB)
REGEXP_EXTRACT(text, pattern, group)

-- Replace with REGEXP_REPLACE
REGEXP_REPLACE(string, pattern, replacement)
-- Replace all non-digit characters in phone:
REGEXP_REPLACE(phone, '[^0-9]', '', 'g')  -- 'g' = global (all occurrences)
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 12

Aggregating Strings — STRING_AGG and ARRAY_AGG

Standard aggregates collapse values to a single number. STRING_AGG collapses multiple string values into a single concatenated string — essential for building comma-separated lists, generating labels, and creating readable summaries.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 13

Real-World Data Cleaning Pipeline

In practice, string functions are combined into cleaning pipelines — multiple transformations chained together to take raw dirty data and produce standardised output ready for joining, reporting, or loading into a clean table.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 14

What This Looks Like at Work

You are a data engineer at Shopify. A partner has delivered a CSV of 50,000 seller records — names are in inconsistent case, phone numbers have various formats, city names have typos and whitespace, and emails are uncleaned. Before loading into the sellers table, you write a cleaning query that standardises every string field.

9:00 AM
Raw data sample arrives
Sample shows: 'RAHUL sharma', ' New York ', 'rahul@GMAIL.COM', '91-98765-43210' — every field needs cleaning.
9:20 AM
Build the cleaning pipeline
Chain string functions to produce a clean output SELECT that can be wrapped in INSERT INTO sellers.
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
10:00 AM
Cleaning query reviewed and approved
The cleaning pipeline standardises all string fields consistently. The INSERT INTO sellers SELECT ... wraps this query and loads 50,000 clean records in under 3 seconds. Post-load validation confirms zero records with invalid email format or untrimmed whitespace.

🎯 Pro Tip

Always create a cleaning SELECT before writing the INSERT. Run the SELECT, spot-check 20 rows manually, and verify edge cases (names with special characters, NULL values, extremely short or long values). Only wrap in INSERT once the SELECT output looks correct. Cleaning bugs caught in SELECT cost nothing. Cleaning bugs discovered after INSERT cost a DELETE, a fix, and a re-insert.

// Part 15

Interview Prep — 5 Questions With Complete Answers

Q: How do you perform a case-insensitive string comparison in SQL?

Three approaches. First, normalise both sides using LOWER() or UPPER() before comparing: WHERE LOWER(city) = LOWER('bangalore'). This converts both values to the same case before the comparison, making it case-insensitive. Second, in PostgreSQL and DuckDB, use ILIKE instead of LIKE: WHERE city ILIKE 'bangalore' — ILIKE is the case-insensitive version of LIKE and supports wildcards. Third, for joins on string columns, normalise the join key: JOIN stores ON LOWER(TRIM(o.city)) = LOWER(TRIM(s.city)).

The most important practical application is join key normalisation. Two tables may store the same city as 'Seattle' and 'bangalore' — an INNER JOIN without normalisation would miss the match entirely. Using LOWER(TRIM(city)) on both sides ensures the join finds matches regardless of case or whitespace differences.

Performance consideration: applying LOWER() or TRIM() to a column in WHERE prevents the database from using an index on that column — because the index stores the original values, not the lowercased versions. For high-frequency case-insensitive searches on large tables, create a functional index: CREATE INDEX idx_customers_city_lower ON customers (LOWER(city)). Then WHERE LOWER(city) = 'bangalore' can use this index efficiently. Alternatively, enforce case normalisation at write time (store all city values in title case) so queries never need to apply LOWER().

Q: What is the difference between REPLACE and TRANSLATE?

REPLACE substitutes all occurrences of a specific substring with a replacement string: REPLACE(string, find_string, replace_string). It works on substrings of any length. REPLACE('Hello World', 'World', 'SQL') returns 'Hello SQL'. REPLACE can only handle one substitution pattern per call — to substitute multiple different patterns requires chaining multiple REPLACE calls.

TRANSLATE performs character-by-character substitution — each character in the FROM set is replaced by the corresponding character in the TO set: TRANSLATE(string, from_chars, to_chars). It processes the entire character-by-character mapping in a single pass. TRANSLATE('aeiou', 'aeiou', '12345') replaces a→1, e→2, i→3, o→4, u→5 in one call. If the TO set is shorter than the FROM set, characters with no corresponding TO character are deleted.

The practical choice: use REPLACE when substituting specific substrings or words — REPLACE(text, 'Ltd', 'Limited'). Use TRANSLATE when you need to substitute or remove many individual characters in one pass — TRANSLATE(phone, '()-+ ', '') removes five different characters in a single function call instead of five nested REPLACE calls. For phone number cleaning, TRANSLATE is significantly cleaner than REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(phone, '(', ''), ')', ''), '-', ''), '+', ''), ' ', '')).

Q: How would you extract the domain from an email address using SQL string functions?

Two approaches. Using SPLIT_PART: SPLIT_PART(email, '@', 2) splits by '@' and returns the second segment — the domain. For 'user@gmail.com' this returns 'gmail.com'. SPLIT_PART is the cleanest approach when the delimiter is a single consistent character.

Using SUBSTRING and POSITION: SUBSTRING(email, POSITION('@' IN email) + 1) starts extraction from one character after the '@' and takes everything to the end. POSITION('@' IN 'user@gmail.com') returns 5, so SUBSTRING starts from position 6, returning 'gmail.com'. This approach is more verbose but is portable across databases that may not support SPLIT_PART.

To further extract just the domain name without the TLD: SPLIT_PART(SPLIT_PART(email, '@', 2), '.', 1) extracts 'gmail' from 'gmail.com'. To validate that the email has an @ before extracting: WHERE POSITION('@' IN email) > 0 ensures only valid emails are processed. To handle NULLs: SPLIT_PART(COALESCE(email, ''), '@', 2) returns empty string for NULL emails rather than NULL. In production data cleaning, always validate the email format first with LIKE '%@%.%' or a regex before attempting extraction — malformed emails can cause unexpected results.

Q: What does STRING_AGG do and how is it different from GROUP_CONCAT in MySQL?

STRING_AGG is an aggregate function that concatenates string values from multiple rows into a single string, with a specified separator between each value. It is the PostgreSQL and DuckDB syntax. STRING_AGG(product_name, ', ' ORDER BY product_name) collects all product_name values in the group, sorts them alphabetically, and joins them with ', '. The result is one string per group — for a GROUP BY store_id, each store gets one string listing all its products.

GROUP_CONCAT is the MySQL equivalent: GROUP_CONCAT(product_name ORDER BY product_name SEPARATOR ', '). The functionality is identical — both aggregate string values from multiple rows into one concatenated string with a separator. The syntax differs: STRING_AGG uses standard SQL aggregate function syntax with the separator as the second argument; GROUP_CONCAT uses MySQL's non-standard keyword syntax with SEPARATOR.

Important differences: STRING_AGG returns NULL when there are no rows in the group (consistent with SUM/AVG behaviour). GROUP_CONCAT returns NULL in the same case. STRING_AGG in PostgreSQL has a strict type requirement — the column must be text or castable to text. STRING_AGG supports ORDER BY inside the function to control the concatenation order. Both support DISTINCT to deduplicate values before aggregating: STRING_AGG(DISTINCT category, ', ') produces a deduplicated list. Both have length limits (PostgreSQL default 1GB; MySQL default 1024 bytes configurable with group_concat_max_len). For large result sets, consider whether aggregating to a string is the right approach or whether a separate rows-based result would be cleaner.

Q: How do you handle NULL values in string operations?

NULL propagates through most string operations — any expression involving NULL produces NULL. 'Hello' || NULL || 'World' = NULL (not 'HelloWorld'). LENGTH(NULL) = NULL. LOWER(NULL) = NULL. TRIM(NULL) = NULL. This is the correct SQL behaviour but it can cause surprising results when any input column contains NULLs.

COALESCE is the primary tool for handling NULLs in string operations: COALESCE(column, '') replaces NULL with an empty string, allowing concatenation to proceed. first_name || ' ' || COALESCE(middle_name, '') || ' ' || last_name builds a full name even when middle_name is NULL. COALESCE(city, 'Unknown') provides a display label for NULL city values.

CONCAT() in MySQL treats NULL as empty string rather than propagating it — CONCAT('Hello', NULL, 'World') = 'HelloWorld' in MySQL, unlike PostgreSQL's || which would return NULL. This makes CONCAT() safer for nullable columns in MySQL but masks potential data issues. CONCAT_WS (Concatenate With Separator) also skips NULL values entirely — CONCAT_WS(', ', city, state, country) with a NULL city produces 'state, country' rather than ', state, country'. For regex and pattern matching functions, NULL inputs always return NULL — wrap nullable columns with COALESCE before passing to REGEXP_REPLACE or REGEXP_EXTRACT. The consistent rule: use COALESCE to substitute a meaningful default (empty string, 'Unknown', 'N/A') before any string operation on a nullable column.

// Part 16

Errors You Will Hit — And Exactly Why They Happen

String comparison fails to match — 'Seattle' != 'bangalore' — joins miss rows

Cause: SQL string comparisons are case-sensitive in PostgreSQL and most databases. 'Seattle' and 'bangalore' are different values. A JOIN or WHERE using direct equality will miss rows where the case differs between the two columns being compared. This is especially common after bulk imports where different sources used different conventions.

Fix: Normalise both sides with LOWER() before comparing: WHERE LOWER(city) = 'bangalore' or JOIN ON LOWER(a.city) = LOWER(b.city). For PostgreSQL, ILIKE is the case-insensitive alternative to LIKE. Long-term fix: enforce consistent case at write time using CHECK constraints or triggers, so all city values are always title case. For frequent queries, create a functional index on LOWER(city) to maintain performance while using case-insensitive comparisons.

String concatenation returns NULL — full_name is NULL even though first_name and last_name are not

Cause: The || operator (and CONCAT in PostgreSQL) returns NULL if any operand is NULL. If middle_name is NULL, first_name || ' ' || middle_name || ' ' || last_name = NULL — the NULL in the middle propagates to the entire expression. Any NULL in the concatenation chain nullifies the whole result.

Fix: Wrap nullable columns with COALESCE: COALESCE(middle_name, '') or use CONCAT_WS which skips NULL values. For a full name with optional middle name: first_name || CASE WHEN middle_name IS NOT NULL THEN ' ' || middle_name ELSE '' END || ' ' || last_name. Or use CONCAT_WS(' ', first_name, middle_name, last_name) — CONCAT_WS skips NULLs and only inserts the separator between non-NULL values.

LIKE pattern not matching as expected — wildcard seems to be ignored

Cause: Three common causes: (1) Case sensitivity — LIKE 'bangalore%' does not match 'Seattle%' in PostgreSQL. (2) Whitespace — the value has leading/trailing spaces that the pattern does not account for. (3) Incorrect wildcard placement — LIKE 'Seattle' without any % or _ is an exact match, not a pattern. The value must exactly equal 'Seattle' for it to match.

Fix: Use ILIKE instead of LIKE for case-insensitive matching. Wrap the column with TRIM() before LIKE: WHERE TRIM(city) LIKE 'Seattle%'. Verify wildcards are in the right position: LIKE '%Seattle%' matches the string anywhere; LIKE 'Seattle%' only matches strings starting with 'Seattle'. Test the pattern in isolation: SELECT 'Seattle' LIKE 'bangalore%' — this returns FALSE without ILIKE, confirming the case-sensitivity issue.

SUBSTRING returns unexpected result — wrong characters extracted

Cause: SUBSTRING in SQL is 1-indexed (the first character is at position 1), not 0-indexed like most programming languages. SUBSTRING('Hello', 0, 3) may return different results than expected — position 0 is before the first character. Off-by-one errors are common when developers from Python/JavaScript backgrounds expect 0-based indexing.

Fix: Remember SQL string positions are 1-indexed: SUBSTRING('Hello', 1, 3) returns 'Hel' (3 characters starting from position 1). SUBSTRING('Hello', 2, 3) returns 'ell' (3 characters starting from position 2). Test with a known string: SELECT SUBSTRING('Hello World', 7, 5) should return 'World'. Use POSITION to find the starting position dynamically: SUBSTRING(email, POSITION('@' IN email) + 1) correctly extracts everything after the @.

REPLACE only removes the first occurrence — other occurrences remain

Cause: This is a MySQL-specific behaviour in some contexts, or confusion with programming language string replace behaviour. In standard SQL (PostgreSQL, DuckDB), REPLACE replaces ALL occurrences by default. In some string libraries in application code, replace() only replaces the first occurrence unless a global flag is used.

Fix: In PostgreSQL and DuckDB, REPLACE always replaces all occurrences: REPLACE('aababab', 'ab', 'X') returns 'aXXX' — all three occurrences replaced. If you are seeing only the first occurrence replaced, verify you are using the SQL REPLACE function (not application-level string replace) and that you are running the query on the correct database. For REGEXP_REPLACE, use the 'g' flag for global replacement: REGEXP_REPLACE(text, pattern, replacement, 'g').

Try It Yourself

Write a query that generates a clean customer directory from FreshCart's customers table. Each row should show: a display_id (format: 'CUST-' followed by zero-padded 6-digit customer_id), clean_name (first + last name in Title Case, trimmed), username (everything before @ in email, lowercased), domain (everything after @ in email, lowercased), city_display (city in Title Case), a customer_slug (lowercase first name + '.' + lowercase last name + '-' + customer_id, all spaces replaced with hyphens), and a short_label (first initial + '. ' + last name, e.g. 'A. Khan'). Order by clean_name. Only include customers whose email contains an @ symbol.

🎯 Key Takeaways

  • String functions transform text at query time without modifying stored data — use them for display formatting, data cleaning, join key normalisation, and validation.
  • Always normalise case before comparing or joining string columns: LOWER(city) = LOWER(other_city). Use ILIKE (PostgreSQL/DuckDB) for case-insensitive LIKE matching.
  • TRIM removes leading and trailing whitespace. Always TRIM before comparing or joining on string columns — a trailing space makes 'Seattle' != 'Seattle ' causing silent join misses.
  • NULL propagates through string operations: 'Hello' || NULL = NULL. Wrap nullable columns with COALESCE before concatenation. CONCAT_WS skips NULLs automatically.
  • REPLACE substitutes substrings. TRANSLATE substitutes individual characters in one pass — use TRANSLATE for cleaning multiple different characters (phone number symbols) instead of chaining REPLACE.
  • SQL strings are 1-indexed: SUBSTRING(str, 1, 3) returns the first 3 characters. SPLIT_PART(str, delimiter, n) returns the nth segment (1-indexed).
  • LIKE uses % (any characters) and _ (one character). ILIKE is the case-insensitive variant. Applying LIKE/LOWER on indexed columns prevents index use — create functional indexes for frequent case-insensitive searches.
  • STRING_AGG(column, separator ORDER BY col) aggregates multiple rows into one concatenated string — essential for building comma-separated lists and human-readable summaries.
  • The standard cleaning chain: INITCAP(LOWER(TRIM(column))) produces consistently formatted Title Case from any input regardless of original capitalisation or whitespace.
  • Build cleaning pipelines as SELECT first, verify with spot checks on 20+ rows, then wrap in INSERT. Cleaning bugs caught in SELECT cost nothing. Bugs found after INSERT require DELETE + reclean + reinsert.

What comes next

In Module 42, you learn date and time functions — extracting parts, calculating differences, formatting, truncating to periods, and every temporal operation needed for time-series analytics and reporting.

Module 42 → Date and Time Functions
Share

Discussion

0

Have a better approach? Found something outdated? Share it — your knowledge helps everyone learning here.

Continue with GitHub
Loading...