7 pat t e r n m at c h I n g w I t h r e g u L a r e X p r e s s I o n s


Finding Patterns of Text with Regular Expressions



Yüklə 397,03 Kb.
Pdf görüntüsü
səhifə3/25
tarix29.11.2022
ölçüsü397,03 Kb.
#71308
1   2   3   4   5   6   7   8   9   ...   25
P A T T E R N M A T C H I N G W I T H

Finding Patterns of Text with Regular Expressions
The previous phone number–finding program works, but it uses a lot of 
code to do something limited: the 
isPhoneNumber()
function is 17 lines but 
can find only one pattern of phone numbers. What about a phone number 
formatted like 415.555.4242 or (415) 555-4242? What if the phone num-
ber had an extension, like 415-555-4242 x99? The 
isPhoneNumber()
function 
would fail to validate them. You could add yet more code for these addi-
tional patterns, but there is an easier way.
Regular expressions, called regexes for short, are descriptions for a pat-
tern of text. For example, a 
\d
in a regex stands for a digit character—that 
is, any single numeral from 0 to 9. The regex 
\d\d\d-\d\d\d-\d\d\d\d
is used 
by Python to match the same text pattern the previous 
isPhoneNumber()
function did: a string of three numbers, a hyphen, three more numbers, 
another hyphen, and four numbers. Any other string would not match the 
\d\d\d-\d\d\d-\d\d\d\d
regex.
But regular expressions can be much more sophisticated. For example, 
adding a 
3
in braces (
{3}
) after a pattern is like saying, “Match this pattern 
three times.” So the slightly shorter regex 
\d{3}-\d{3}-\d{4}
also matches the 
correct phone number format.


Pattern Matching with Regular Expressions
165
Creating Regex Objects 
All the regex functions in Python are in the 
re
module. Enter the following 
into the interactive shell to import this module:
>>> import re
N O T E
 
Most of the examples in this chapter will require the 
re
 module, so remember to import 
it at the beginning of any script you write or any time you restart Mu. Otherwise, 
you’ll get a 
NameError: name 're' is not defined
 error message.
Passing a string value representing your regular expression to 
re.compile()
returns a 
Regex
pattern object (or simply, a 
Regex
object).
To create a 
Regex
object that matches the phone number pattern, enter 
the following into the interactive shell. (Remember that 
\d
means “a digit 
character” and 
\d\d\d-\d\d\d-\d\d\d\d
is the regular expression for a phone 
number pattern.)
>>> phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
Now the 
phoneNumRegex
variable contains a 
Regex
object. 

Yüklə 397,03 Kb.

Dostları ilə paylaş:
1   2   3   4   5   6   7   8   9   ...   25




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©www.azkurs.org 2024
rəhbərliyinə müraciət

gir | qeydiyyatdan keç
    Ana səhifə


yükləyin