Pick up some Python

ANALYSIS
A client engagement surfaced a few months ago that called for me to work with Python. I figured the easiest way to get up to speed would be to apply my knowledge of a similar language and translate an existing script to Python. After some investigation, I found that Python is somewhat similar to Perl, a language I know fairly well. I'm not going to show how I did the conversion; instead, I'll walk through the Python version of the script and illustrate the key statements it uses.

The script I chose imports a delimited flat file containing 3,000 inventory items, extracts the variable-length item description field, converts it into three 30-character fields, and rewrites the file. You can see the script in its entirety in Listing A.

Note: In Python, loops and flow control statements aren't terminated by a closing keyword or brace, which can get a little confusing. You'll notice that I've added comments to indicate where code blocks are terminated. This helps me better organize and read my code.

Here we go.

import string #use string library
import re #use regular expression library

These first two lines import the string and regular expression modules. Python is fairly object-oriented and allows modules to be imported, increasing the expandability and modularity of one's code.

inputfile = r"c:\Work\CNET\Inventory.txt"    #set "inputfile" to be the name of the delimited file
outputfile = r"c:\Work\CNET\inv.txt"         #set "outputfile" to the name of the output file

This fragment shows how to define a string variable in Python (e.g., variable_name = "string"). The r prefix marks a raw string, so the backslashes in the Windows paths are taken literally. Python statements are terminated at the end of each line. Comments begin with a pound sign (#).

f = open(inputfile)             #open "inputfile" for reading
o = open(outputfile, "w")       #open "outputfile" for writing

These two statements create the file handles necessary for importing and exporting the two files. A file handle is simply an object that Python uses to access external files. When the open function is called with only one argument, the file is opened for reading only; the "w" indicates that the file is opened for writing.

while (1):                     #process the input file
    offset = 0                 #the first 30-character field offset
    line = f.readline()        #assign the current line to "line"
    if not line : break        #exit the while loop at the end of the file

The while statement executes the code contained within it until the condition is false. I used 1, which is always true, because I chose to exit the while loop from inside with the if statement. Python uses only indentation to delimit blocks of code; this requires you to pay close attention to what you are doing but helps ensure readable code.

The f.readline() statement calls the readline method of the file object. The method returns a string containing the current line of the file and moves the file pointer to the next line. Once the end of the file is reached, readline returns an empty string. The statement if not line : break exits the loop at that point, because an empty string counts as false.

    line = line.rstrip()             #remove the newline character (and any trailing whitespace)
    cols = line.split('\t')          #split on tabs

The rstrip method removes any trailing whitespace characters from the string object and returns the new string. The split method takes one argument, the delimiter to split on, and returns a list (or array) of strings. Both are built-in methods of string objects, so they are available whether or not the string module has been imported. Now is also a good time to point out that object types are not differentiated syntactically in Python. A variable's type is simply determined by the value assigned to it.

splitme = cols[6]        #set "splitme" to be the data from the 7th column
splitup = list(splitme) #set "splitup" to be a list of characters from the string "splitme"

The first of these two lines shows how to refer to an element in an array. The array is indexed from 0 to (n-1), where n is the number of elements. The second line demonstrates the list function, which takes a string as an argument and returns a list of characters.

p = re.compile(r'\s') #compile a regular expression object "p" to find whitespace

In this statement, I have used the re module to create a regular expression object. The object is created, or "compiled", using the compile function, which takes the regular expression to be used in pattern matching as its argument. Here, '\s' is used, which matches any single whitespace character (a space, tab, or newline). Different expressions could be used to match elements such as any alphanumeric character or any numeral.

if len(splitup) > 30: #if the item description contains more than 30 characters

This statement introduces the len function. It takes a sequence, such as a list or a string, as its argument and returns an integer equal to the number of elements it contains. I think it's quite handy.

for i in range(11): #count from 0 to 10

I found two interesting things when creating for loops in Python. The first is that for loops iterate over a list of elements. I could have said for i in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] but I chose the range function, which is the second interesting thing I discovered. The range function generates a sequence of integers automatically (a list in Python 2, a lazy range object in Python 3). It takes one or two arguments, plus an optional step. If one argument (n) is given, the integers from 0 to n-1 are generated. If two arguments (m, n) are given, the integers from m to n-1 are generated. Being able to iterate over lists is rather useful because you can iterate over a list of any object type. You could, for example, use a list of strings or a list of regular expression objects.
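As a rough illustration of the list, len, range, and for-loop behaviour described above, here is a small Python 3 sketch. The sample description string and the names indexes and whitespace_positions are made up for this example and do not come from the original script.

```python
import re

# Hypothetical sample value standing in for cols[6]; not from the original data.
splitme = "Hex bolt, zinc plated, 1/4-20 x 2 in, box of 100"

splitup = list(splitme)        # string -> list of single characters
print(len(splitup))            # number of characters in the description

p = re.compile(r"\s")          # matches any single whitespace character

# range(11) yields 0..10, so 30 - i walks from index 30 back to index 20.
indexes = [30 - i for i in range(11)]
print(indexes)                 # [30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20]

# range(m, n) starts at m and stops before n.
print(list(range(2, 5)))       # [2, 3, 4]

# Positions in the description where the compiled pattern finds whitespace.
whitespace_positions = [i for i, ch in enumerate(splitup) if p.match(ch)]
print(whitespace_positions)

# A for loop works over any list, for example a list of strings.
for word in ["spam", "eggs"]:
    print(word.upper())
```

Wrapping range in list() is only needed in Python 3, where range no longer returns a list directly.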
m = p.match(splitup[(30 - i)]) #find the first space

The match method of the regular expression object takes a character or string as its argument and compares it with the regular expression that was compiled. It returns a match object, which counts as true, if the expression matched at the start of the string, and None otherwise.

As I mentioned earlier, loops and flow control statements aren't terminated in Python. In Listing B, you can see I've added comments to indicate where code blocks are terminated.

newguy = ''.join(splitup) + '\t'  #make the list a string

The join method is the opposite of the split method. It is called on the separator string and takes the list of strings or characters to be joined as its argument. Python 2 also offered string.join(splitup, ''), which takes the list first and the separator second; that function was removed in Python 3. To concatenate two strings, the plus sign (+) is used.

The portion of code in Listing C didn't change from the Perl version of the script. When the if condition preceding it is not met, the else conditional statement shown in Listing D is executed. In Listing E, the write method for file objects is used. The only argument passed to it is the string to be written to the file.

Summary
Here's a recap of the elements I covered in this Python script:
  • Objects and classes
  • Variables (object and class types, scalars, and lists)
  • Flow control (while and for loops, and if/else statements)
  • Functions (e.g., len)
  • Methods of objects (e.g., join and split)
  • File I/O
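The elements recapped above can be combined into one compact Python 3 sketch. This is not the author's Listing A, which is not reproduced here. The function names split_description and convert, the break-at-whitespace rule, the truncation of the last field, and the sample record are all assumptions made for illustration, and in-memory io.StringIO objects stand in for the real input and output files.

```python
import io
import re

FIELD_WIDTH = 30                    # width of each output field, per the article
LOOKBACK = 10                       # mirrors range(11): check indexes 30 down to 20
WHITESPACE = re.compile(r"\s")


def split_description(text, fields=3):
    """Split text into `fields` pieces of at most FIELD_WIDTH characters.

    Assumed rule: break at the last whitespace found within LOOKBACK
    characters of the limit, otherwise cut hard at FIELD_WIDTH.
    """
    chunks = []
    rest = text
    for _ in range(fields - 1):
        if len(rest) <= FIELD_WIDTH:
            chunks.append(rest)     # what is left already fits in one field
            rest = ""
            continue
        cut = FIELD_WIDTH           # fallback when no whitespace is nearby
        for i in range(LOOKBACK + 1):
            if WHITESPACE.match(rest[FIELD_WIDTH - i]):
                cut = FIELD_WIDTH - i
                break
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    chunks.append(rest[:FIELD_WIDTH])   # assumption: overflow in the last field is dropped
    return chunks


def convert(infile, outfile, column=6):
    """Read tab-delimited lines and replace one column with three fields."""
    for line in infile:
        # Strip only the newline so trailing empty columns are preserved.
        cols = line.rstrip("\n").split("\t")
        cols[column:column + 1] = split_description(cols[column])
        outfile.write("\t".join(cols) + "\n")


# Hypothetical record: an item code, five empty columns, then the description.
sample = "A1\t\t\t\t\t\tStainless steel hex bolt with washer and lock nut, box of 100\n"
out = io.StringIO()
convert(io.StringIO(sample), out)
print(out.getvalue())
```

To run this against real files, the two StringIO objects could be replaced with open(inputfile) and open(outputfile, "w"), ideally inside a with statement so that both files are closed automatically.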
Python is a fairly simple language to pick up. The conversion of the Perl script took about six hours. Check python.org for some great information. Its Windows download contains an editor and debugger, so writing, testing and executing code is easy and pleasurable.