
VALUE ~ '^[\u0600-\u06FF\s]+$'

This is a regular-expression check used inside a DOMAIN constraint. (In a CHECK constraint written directly on a table, the column name takes the place of VALUE.)

Big picture

“Only allow Arabic-script characters (Pashto/Arabic/etc.) and spaces — nothing else.”

Piece-by-piece explanation

1️⃣ VALUE
  • In a DOMAIN, VALUE means:

“The value being inserted or updated”

Example:

INSERT INTO pashto_words VALUES ('ښکلی');

Here:

VALUE = 'ښکلی'
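For comparison, the keyword VALUE is only available inside a domain constraint; a CHECK written directly on a table names the column instead. A minimal sketch, assuming a hypothetical table pashto_words_plain with a single column word (both names are made up for illustration), a UTF8 database, and the default standard_conforming_strings = on:

```sql
-- Hypothetical table: the column name `word` plays the role of VALUE.
CREATE TABLE pashto_words_plain (
    word text CHECK (word ~ '^[\u0600-\u06FF\s]+$')
);

-- Accepted: every character lies in U+0600..U+06FF.
INSERT INTO pashto_words_plain VALUES ('ښکلی');
```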

2️⃣ ~

  • This is PostgreSQL’s regex match operator

  • Means: “matches the regular expression”

Operator   Meaning
~          matches (case-sensitive)
!~         does NOT match
~*         matches (case-insensitive)
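The difference between these operators is easiest to see with plain ASCII text. A small sketch (the column aliases are arbitrary):

```sql
SELECT 'abc' ~  'ABC' AS tilde,        -- false: case-sensitive match fails
       'abc' ~* 'ABC' AS tilde_star,   -- true: case-insensitive match
       'abc' !~ 'ABC' AS not_tilde;    -- true: negation of ~
```

Case only matters for scripts that have upper and lower case, so for Arabic-script text ~ and ~* should behave the same.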


3️⃣ '^[\u0600-\u06FF\s]+$' → the REGEX

Now the important part

Regex explained visually

^        start of string
[ ... ]  allowed characters
+        one or more
$        end of string

🔤 \u0600-\u06FF

This is a Unicode range: the Arabic block, U+0600 to U+06FF.

It includes:
  • Arabic letters

  • Pashto letters

  • Persian letters

  • Urdu letters

  • Arabic punctuation

✅ Examples inside this range:

ا ب ت ث
پ چ ژ ړ ږ ښ ځ ټ

❌ Not included:

A B C
1 2 3
@ # !
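Whether a given character falls inside the range can be checked from its code point. A rough sketch, assuming the database encoding is UTF8 (in that case ascii() returns the Unicode code point of the first character); the sample characters are taken from the lists above:

```sql
SELECT ch,
       upper(to_hex(ascii(ch)))        AS code_point_hex,
       ascii(ch) BETWEEN 1536 AND 1791 AS in_range   -- 0x0600 = 1536, 0x06FF = 1791
FROM unnest(ARRAY['ا', 'پ', 'ښ', 'A', '1', '@']) AS t(ch);
```

On a UTF8 database, ا, پ and ښ should come back as 627, 67E and 69A with in_range true, while A, 1 and @ fall outside the range.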

🔹 \s
  • Means whitespace

  • Includes:

    • space

    • tab

    • newline
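Because \s covers more than the plain space, tabs and newlines pass the check, and so does a value made of whitespace only. A sketch under the assumption of a UTF8 database with standard_conforming_strings = on; the last column uses a narrower class with a literal space, shown only as an illustration:

```sql
-- E'...' is used only for the test values so that \t and \n become a real
-- tab and newline; the pattern itself stays a plain '...' literal.
SELECT E'دا\tښه' ~ '^[\u0600-\u06FF\s]+$' AS with_tab,          -- true
       E'دا\nښه' ~ '^[\u0600-\u06FF\s]+$' AS with_newline,      -- true
       '   '     ~ '^[\u0600-\u06FF\s]+$' AS only_spaces,       -- true: whitespace alone passes
       E'دا\tښه' ~ '^[\u0600-\u06FF ]+$'  AS space_only_class;  -- false: tab is no longer allowed
```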

So this allows:

"دا ښه کتاب دی"

🔹 [ … ] (character class)

[\u0600-\u06FF\s]
Means:
“One character that is either:
  • in the Arabic Unicode block (U+0600 to U+06FF), which covers Pashto

  • OR whitespace”


🔹 +

+    one or more

✔ At least one character required

❌ Empty string rejected
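The effect of + can be seen by comparing it with *, which also accepts zero characters. A short sketch (the * pattern is only for contrast and is not part of the rule):

```sql
SELECT ''  ~ '^[\u0600-\u06FF\s]+$' AS plus_empty,    -- false: + needs at least one character
       ''  ~ '^[\u0600-\u06FF\s]*$' AS star_empty,    -- true: * accepts the empty string
       'س' ~ '^[\u0600-\u06FF\s]+$' AS plus_single;   -- true: one allowed character is enough
```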


🔹 ^ and $ (anchors)

Symbol   Meaning
^        start of string
$        end of string

Together they mean:

“The ENTIRE value must match — not just part of it.”
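Dropping the anchors shows why they matter: without them, ~ succeeds as soon as any part of the string matches. A minimal sketch comparing the two forms:

```sql
SELECT 'test سلام' ~ '[\u0600-\u06FF\s]+'   AS unanchored,  -- true: the Arabic-script part matches
       'test سلام' ~ '^[\u0600-\u06FF\s]+$' AS anchored;    -- false: 'test' is outside the class
```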


✅ What PASSES this rule

Value          Why
سلام           Pashto letters
ښکلی کتاب      Letters + space
افغانستان      Arabic script
دا ښه دی       Valid Pashto


❌ What FAILS this rule

Value          Why
hello          Latin letters
سلام123        Western digits (0-9)
سلام!          ASCII punctuation
test سلام      Mixed scripts
''             Empty
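The passing and failing values listed above can be run through the regex in one query. A sketch, again assuming a UTF8 database and standard_conforming_strings = on:

```sql
SELECT v AS value,
       v ~ '^[\u0600-\u06FF\s]+$' AS passes
FROM (VALUES
        ('سلام'), ('ښکلی کتاب'), ('افغانستان'), ('دا ښه دی'),    -- expected: true
        ('hello'), ('سلام123'), ('سلام!'), ('test سلام'), ('')    -- expected: false
     ) AS t(v);
```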


Example domain using it

CREATE DOMAIN pashto_text AS TEXT
CHECK (VALUE ~ '^[\u0600-\u06FF\s]+$');
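Once the domain exists, it can be used like any other column type. The following sketch assumes that the pashto_words table from the earlier INSERT has a single column, here called word (the column name and the NOT NULL are assumptions, not something the text above specifies):

```sql
-- Run after the CREATE DOMAIN statement above.
CREATE TABLE pashto_words (
    word pashto_text NOT NULL   -- NOT NULL because a CHECK alone does not reject NULL
);

INSERT INTO pashto_words VALUES ('ښکلی');       -- accepted
INSERT INTO pashto_words VALUES ('دا ښه دی');   -- accepted

-- Each of these is expected to fail the domain's CHECK constraint:
-- INSERT INTO pashto_words VALUES ('hello');
-- INSERT INTO pashto_words VALUES ('سلام123');

-- The check also runs on a cast, without any table involved:
SELECT 'ښکلی'::pashto_text;
```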

Why this is IMPORTANT for Pashto

Pashto:
  • Is written right-to-left

  • Uses the Arabic script

  • Has extra letters (for example ټ ځ ړ ږ ښ)

  • Should not have Latin characters mixed in accidentally

This rule ensures:
  • ✅ Clean data

  • ✅ Correct script (Arabic, Persian, and Urdu text also pass; the regex cannot tell the languages apart)

  • ✅ No garbage text

  • ✅ No accidental English input


⚠️ Important notes
  1. This checks characters, not grammar

  2. It does NOT sort — that’s collation’s job

  3. Emoji ❌ (not in this range)

  4. Arabic-Indic digits (١٢٣, U+0660 to U+0669) ✅ are already inside \u0600-\u06FF; Western digits (123) ❌ unless added

Advanced version (allow Western digits 0-9 too)

^[\u0600-\u06FF0-9\s]+$
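Digit handling is worth checking directly, because the Arabic-Indic digits U+0660 to U+0669 sit inside U+0600 to U+06FF while the Western digits 0-9 do not. A small sketch, assuming a UTF8 database; the pattern with 0-9 added to the class is an illustrative variant:

```sql
SELECT '١٢٣'     ~ '^[\u0600-\u06FF\s]+$'    AS arabic_indic_digits,  -- true: U+0661..U+0663 are in range
       '123'     ~ '^[\u0600-\u06FF\s]+$'    AS western_digits,       -- false
       'سلام123' ~ '^[\u0600-\u06FF0-9\s]+$' AS with_0_9_added;       -- true
```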

Summary

VALUE ~ '^[\u0600-\u06FF\s]+$'

forces the column to contain ONLY characters from the Arabic Unicode block (which covers Pashto) and whitespace, nothing else.