Differences

This shows you the differences between two versions of the page.

--- python:twistedlxmlandre [2014/08/11 06:48] – [Custom Functions] admin
+++ python:twistedlxmlandre [2015/10/27 00:08] – [Regular Expression Language] admin
@@ Line 453: / Line 453: @@
 print ET.tostring(data)
 </code> output:<code>
+<data><country name="Liechtenstein"><rank>1</rank><year>2008</year></country><country name="Singapore"><rank>4</rank><year>2011</year></country></data>
+</code>
+  * Build XML using lxml.etree:<code python>
+from lxml import etree as ET
+'''
+<?xml version="1.0"?>
+<data>
+    <country name="Liechtenstein">
+        <rank>1</rank>
+        <year>2008</year>
+    </country>
+    <country name="Singapore">
+        <rank>4</rank>
+        <year>2011</year>
+    </country>
+</data>
+'''
+data = ET.Element('data')
+country1 = ET.SubElement(data, 'country', {'name':'Liechtenstein'})
+rank1 = ET.SubElement(country1, 'rank')
+rank1.text = '1'
+year1 = ET.SubElement(country1, 'year')
+year1.text = '2008'
+country2 = ET.SubElement(data, 'country', {'name':'Singapore'})
+rank2 = ET.SubElement(country2, 'rank')
+rank2.text = '4'
+year2 = ET.SubElement(country2, 'year')
+year2.text = '2011'
+print ET.tostring(data)
+</code> output: <code>
 <data><country name="Liechtenstein"><rank>1</rank><year>2008</year></country><country name="Singapore"><rank>4</rank><year>2011</year></country></data>
 </code>
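The element-by-element construction above repeats the same steps for each country. A minimal sketch of the same build driven by a loop follows. It assumes the standard library's xml.etree.ElementTree, whose Element, SubElement and tostring calls mirror the lxml.etree ones used above. The `countries` list and the `build_data` helper are illustrative names, not part of the original page.

```python
import xml.etree.ElementTree as ET

# Hypothetical input data: (name, rank, year) for each country
countries = [
    ('Liechtenstein', '1', '2008'),
    ('Singapore', '4', '2011'),
]

def build_data(rows):
    """Build a <data> element with one <country> child per row."""
    data = ET.Element('data')
    for name, rank, year in rows:
        country = ET.SubElement(data, 'country', {'name': name})
        ET.SubElement(country, 'rank').text = rank
        ET.SubElement(country, 'year').text = year
    return data

# tostring() returns bytes on Python 3, so decode it before printing
xml_text = ET.tostring(build_data(countries)).decode('ascii')
print(xml_text)
```

Keeping the rows in a list means that adding a third country only requires one more tuple, not five more statements.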
@@ Line 495: / Line 527: @@
 </code>
 ===== re Package(Regular Expression) =====
+To use the re package, import it first:<code python>
+import re
+</code>
 ==== Regular Expression Language ====
 A regular expression (abbreviated regex or regexp) is a sequence of characters that forms a search pattern\\
-refer: http://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
+References:
+  * http://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
+  * python: https://docs.python.org/2/library/re.html#regular-expression-syntax
 **Match Character**
@@ Line 539: / Line 576: @@
 </code>
 === re.findall ===
-  * re.findall: The findall() is probably the single most powerful function in the re module<code python>
+findall: re.findall() returns every non-overlapping match of a pattern in a string as a list, which makes it probably the single most powerful function in the re module
+  - Example 1: <code python>
 str = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'
@@ Line 548: / Line 586: @@
     # do something with each found email string
     print email
-</code>    <code python>
+</code>How to read the pattern above:
+  * [\w\.-]+ => one or more (sign: +) characters from the set (sign: []): a word character (sign: \w), a literal **.** (sign: \.), or a literal **-**
+  * @[\w\.-]+ => followed by a literal @ and then one or more characters from the same set: [word character, **.** , **-**]
+  - Example 2: <code python>
 # Open file
 f = open('test.txt', 'r')
@@ Line 651: / Line 692: @@
 text2 = re.sub("cool", "good", text)
 print text2
+</code>output:<code>
+Python for beginner is a very good website
 </code>
       * Here is another example (taken from Google's Python class) which searches for all the email addresses and changes them to keep the user (\1) but have yo-yo-dyne.com as the host.<code python>
@@ Line 659: / Line 702: @@
 ## \1 is group(1), \2 group(2) in the replacement
-print re.sub(r'([w.-]+)@([w.-]+)', r'\1@yo-yo-dyne.com', str)
+print re.sub(r'([\w.-]+)@([\w.-]+)', r'\1@yo-yo-dyne.com', str)
 ## purple alice@yo-yo-dyne.com, blah monkey bob@yo-yo-dyne.com blah dishwasher
+</code>output:<code>
+purple alice@yo-yo-dyne.com, blah monkey bob@yo-yo-dyne.com blah dishwasher
 </code>
   * re.compile: The re.compile() function compiles a pattern string into a pattern object, which has methods for operations such as searching for matches or performing string substitutions.
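As a minimal sketch of re.compile, the email pattern from the examples above can be compiled once and then reused through the pattern object's own methods. The sample text and the variable names here are illustrative assumptions, not taken from the page.

```python
import re

# Compile the email pattern once; the pattern object can be reused
email_re = re.compile(r'([\w.-]+)@([\w.-]+)')

# Illustrative sample text (an assumption, not from the page above)
text = 'contact alice@example.com or bob@example.org today'

# findall() on a pattern with groups returns a list of group tuples
pairs = email_re.findall(text)
print(pairs)

# search() returns the first match object, or None if nothing matches
first = email_re.search(text)
print(first.group(0))

# sub() replaces each match; \1 refers to group(1), the user part
replaced = email_re.sub(r'\1@yo-yo-dyne.com', text)
print(replaced)
```

Compiling is mostly a readability and reuse choice here: the module-level functions such as re.findall() accept the same pattern string and behave the same way.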