May 9, 2008 at 12:30
· Filed under Operating Systems, Programming
Marc Balmer (of the OpenBSD Project) investigated reports of weird filesystem behaviour and found a 25-year-old bug in the BSD libc implementation of readdir().
The fix should be in the trunk of all BSDs now and scheduled for merges or backports soon (e.g. see http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/gen/readdir.c revision 1.15’s diff).
Tags:
bsd,
freebsd,
mac os x,
netbsd,
openbsd
April 13, 2008 at 12:14
· Filed under Programming, Security
OpenSSH has a fantastic feature called ControlMaster. This option creates a socket through which later connections to the same host share your already open SSH session. To enable it for all hosts, create a directory such as $HOME/.ssh/sockets and put the following snippet in your $HOME/.ssh/config:
Host *
ControlMaster auto
ControlPath ~/.ssh/sockets/%r@%h:%p
For every username@host:port it will create a socket in $HOME/.ssh/sockets. The only problem is that current Subversion (1.4.6 on my FreeBSD box) cannot work well with control sockets when using the svn+ssh:// URI identifier. In order to work around this problem you can add a specific host before the wildcard entry, for example:
Host svn.example.com
ControlMaster no
Host *
ControlMaster auto
ControlPath ~/.ssh/sockets/%r@%h:%p
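To make the socket naming concrete, here is a minimal Python sketch of how the ControlPath template above turns into one path per username@host:port. It only illustrates the three tokens used in this post (%r, %h, %p) plus a literal %%; the function name expand_control_path and the user "alice" are made up for the example, and ssh itself understands more tokens than this.

```python
import os


def expand_control_path(template, user, host, port=22):
    """Expand the %r, %h and %p tokens of a ControlPath template.

    Illustrative only: this mimics the naming scheme described in the
    post, it is not how ssh implements the expansion.
    """
    tokens = {"%r": user, "%h": host, "%p": str(port), "%%": "%"}
    out = []
    i = 0
    while i < len(template):
        pair = template[i:i + 2]
        if pair in tokens:
            # Known two-character token: substitute and skip both characters.
            out.append(tokens[pair])
            i += 2
        else:
            out.append(template[i])
            i += 1
    # A leading "~" refers to the home directory, as in the config above.
    return os.path.expanduser("".join(out))


if __name__ == "__main__":
    # "alice" is a hypothetical remote user name.
    print(expand_control_path("~/.ssh/sockets/%r@%h:%p", "alice", "svn.example.com"))
```

Two sessions that resolve to the same path share one connection, which is why the port is part of the name.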
Of course, doing it like this is a bit tedious for every Subversion repository you use in this manner. Thankfully there is another way to do this. In $HOME/.subversion/config there is a section called [tunnels]. If you add the following entry to that section it will disable the ControlMaster:
[tunnels]
ssh = ssh -o ControlMaster=no
Tags:
openssh,
subversion,
svn
April 12, 2008 at 19:10
· Filed under Programming, Python
So after yesterday’s post about some compiler results with Python 2.6 I wanted to show how some of GCC’s architecture-specific compiler flags affect the execution of pybench. As I explained in comments I think most people will never even touch the flags passed to Python’s build. Nonetheless, some people asked if I had tuned it in any way. Pádraig Brady had asked me if I had used the optimal GCC architecture flags. On my FreeBSD 7.0-STABLE machine at home (AMD Athlon(tm) 64 X2 Dual Core Processor 4600+ (2411.13-MHz K8-class CPU)) his script stated I had to pass along “-m32 -march=k8 -mfpmath=sse”. My machine is fully 64 bits so I left out the -m32 (since it will not link anyway) and used “-march=k8 -mfpmath=sse” (using -march=native instead of k8 resulted in a 0,1 seconds faster result and -mtune=native -march=native instead of k8 resulted in a 0,1 - 0,2 seconds faster result).
The default option flags on my system are: -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes.
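If you want to see which flags your own interpreter was built with, a small sketch along these lines should do. It assumes a Python that ships the standard sysconfig module (2.7/3.2 and later, so not the 2.6 alpha benchmarked here); the helper name build_flags is made up, and on some platforms, Windows for instance, the values may simply be None.

```python
import sysconfig


def build_flags():
    """Return the compiler and flag variables recorded at build time."""
    names = ("CC", "CFLAGS", "OPT")
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    for name, value in build_flags().items():
        print(f"{name} = {value}")
```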
Considering some other comments about how I did not use a 0-origin for my y-axis I have to point out two things: firstly, given the sometimes close results zooming out too much can eliminate detailed information (of course you have to be careful not to zoom in too much as well); secondly, I like to make sure the graph itself is appropriately centered so you do not get a whitespace skewing in the resulting image. I think, being a follower of the Edward Tufte school of graphic displaying, I did reasonably well. The graphs were made with a tool called Ploticus.

I was curious how the optimization level influenced the resulting program and as such I removed the -O3 option from the compiler flags. As is evident from the graph you are looking at a bit more than a doubling of execution time (an average of 14,2 seconds versus the previous 6,6 and 6,5 seconds).

So, given the huge performance hit from merely leaving out -O3, I was interested in how the other optimization levels worked out. Holger Hoffstätte asked me to use -O2 -fomit-frame-pointer instead of -O3. The results of -O3 and -O2 -fomit-frame-pointer were essentially equal, both averaging 6,5 seconds. Adding -fomit-frame-pointer made no discernible difference at -O1, and plain -O2 without it still averaged 6,5 seconds. The result of using -O1 was quite interesting: it already improves execution by ~86% compared to no optimization. From -O1 to -O2/-O3 we are looking at another increase of ~16%. From the no optimization case to -O2/-O3 execution improves by ~118%.
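For those who want to check the percentages: they are speed-ups expressed as a ratio of run times, so they compose by multiplication, not addition. A quick sketch using only the figures quoted above (the helper names are made up for the example):

```python
def improvement_percent(slow_seconds, fast_seconds):
    """Speed-up of the fast run over the slow run, in percent."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


def combine_percent(first, second):
    """Chain two successive improvements (both in percent)."""
    return ((1.0 + first / 100.0) * (1.0 + second / 100.0) - 1.0) * 100.0


# Averages from the post: 14,2 s without optimization, 6,5 s with -O2/-O3.
direct = improvement_percent(14.2, 6.5)

# Chaining the quoted ~86% (none to -O1) and ~16% (-O1 to -O2/-O3) steps.
chained = combine_percent(86.0, 16.0)

print(round(direct), round(chained))  # 118 116
```

The small gap between the two numbers comes from the rounding in the quoted ~86% and ~16%.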

I tried a profile-guided optimization build, but I have some issues on my FreeBSD 7.0-STABLE with libgcov. Apparently only a libgcov.a is provided and linking gives me a relocation warning. Thankfully I also had a GCC 4.2.4 snapshot from March installed and did a PGO build with that, but I only managed to shave off about 0,2 seconds from the average time.
Tags:
benchmark,
compiler,
edward tufte,
gcc,
python 2.6
April 11, 2008 at 16:43
· Filed under Programming, Python
Due to recent concerns with memory use and execution speed I was curious how Python would behave with different compilers. I took Python 2.6a2 r62288 from the Subversion repository and configured it with the flags: --with-threads --enable-unicode=ucs4 --enable-ipv6. The machine is an HP dc7700p with 1GB memory and an Intel Core2 6300 @ 1.86GHz running Ubuntu 7.10. I installed GCC 3.3.6, 3.4.6, 4.1.3, 4.2.1 from the Gutsy repository, and Intel 10.1.015. The MS Visual Studio 2008 Python was the MSI snapshot of 2008-04-10 from the main Python site. I ran this through Wine 0.9.46 after installing the VC2008 runtime.
First various GCC versions: 3.3.6, 3.4.6, 4.1.3, 4.2.1:

It is good to see that the 3.4 series is faster than the 3.3 series and the 4.2 series is faster than the 4.1 series. I am a bit worried about the 4.1 series drop in performance compared to the 3 series though.
Next we have Python compiled with GCC 3.4.6, 4.2.1, Intel CC 10.1.015, MSC from Visual Studio 2008:

It is nice to see how the Microsoft Visual Studio 2008 compiler produces a binary that, when run through Wine, still performs quite well compared to GCC. I am not quite sure if Wine incurs a performance penalty or not. What’s quite impressive is the performance of the Intel CC compiled Python. If we take the fastest GCC, which is 4.2.1 at the moment, take the average of the 10 rounds of execution, which is 6,574 seconds, and compare that to the average of ICC, which is 5,412 seconds, we see that ICC is about 21% faster. If we take the slowest, GCC 4.1.3 with an average of 7,002 seconds, we even get a result that ICC is about 29% faster.
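The percentages above are simply the ratio of the average run times. A short sketch of the arithmetic, using the averages quoted in this post (the function name percent_faster is made up for the example):

```python
def percent_faster(slow_seconds, fast_seconds):
    """How much faster the fast run is, as a percentage of its own time."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


# Average of the 10 pybench rounds, in seconds, as quoted in the post.
GCC_4_2_1 = 6.574
GCC_4_1_3 = 7.002
ICC_10_1 = 5.412

print(round(percent_faster(GCC_4_2_1, ICC_10_1)))  # 21
print(round(percent_faster(GCC_4_1_3, ICC_10_1)))  # 29
```

Note that this measures the speed-up relative to the ICC time; expressed as time saved relative to the GCC time the numbers would be smaller.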
So it seems that, for people who want to get the full performance out of Python, compiling with ICC might be quite beneficial. I want to check out how ICC progressed from version 8 to version 10 performance-wise.
The raw data can be found at http://www.in-nomine.org/~asmodai/python-pybench.txt.
Tags:
benchmark,
compiler,
gcc,
icc,
python 2.6,
visual studio
February 6, 2008 at 22:56
· Filed under Programming, Python
For the past few months a certain vibe has been building up in parts of the Python community. As it stands, 2008 looks set to become a stellar year for Python.
Just after New Year TIOBE reported this:
Python has been declared as programming language of 2007. It was a close finish, but in the end Python appeared to have the largest increase in ratings in one year time (2.04%). There is no clear reason why Python made this huge jump in 2007. Last month Python surpassed Perl for the first time in history, which is an indication that Python has become the “de facto” glue language at system level. It is especially beloved by system administrators and build managers. Chances are high that Python’s star will rise further in 2008, thanks to the upcoming release of Python 3.
There are a lot of really interesting developments going on. Some interesting developments in my opinion are (in no particular order): Babel, Bitten, Genshi, Trac, Werkzeug, WebOb.
An exciting year indeed.
Tags:
babel,
bitten,
genshi,
trac,
webob,
werkzeug
December 21, 2007 at 13:48
· Filed under Programming, Python
Armin Ronacher released Werkzeug 0.1 a little while ago. As the website for Werkzeug says: “Werkzeug is a collection of various utilities for WSGI applications. It features request and response objects as well as a powerful url dispatcher and a debugging system.”
I have it on my todo list to convert my current CherryPy environment to Werkzeug as a proof of concept and see which one of the two I prefer and will ultimately use for my Japanese-Dutch dictionary project.
Tags:
cherrypy,
werkzeug,
wsgi
October 26, 2004 at 22:05
· Filed under Programming
Sid is quite the powerful LL(1) parser, but my god, do its internals need some major clean-up and overhaul. Functions with 14 arguments?! That’s just asking for trouble.
Going to do the entire de-OSSG routine right now… Slow progress. At least the -y flag to tcc has been committed and now we need to get the apis correctly built and installed.
We’re getting there…
Tags:
c,
ll(1),
parser,
refactoring,
tendra
October 18, 2004 at 23:59
· Filed under Programming
TenDRA has received a lot of OSSG clean-up thus far, with my Lisp-to-Python conversion working wonders on the sources.
Right now I am about 3/5 to 4/5 of the way through the diffs of Amos’ work on the tccenv/-y changes. Hopefully I can commit this this week.
Tags:
c,
Programming,
tendra
September 14, 2004 at 10:39
· Filed under Hardware, Programming
Intel assembly, let me count the ways I loathe thee…
Tags:
assembly,
intel,
work
July 23, 2004 at 14:09
· Filed under Books, Languages, Programming
Received two books yesterday:
Donald Knuth’s volume 3 of the Art of Computer Programming (Sorting and Searching) and
Ken Lunde’s CJKV Information Processing.
Donald Knuth’s volume 2 should be coming my way next week.
Tags:
art of computer programming,
chinese,
donald knuth,
i18n,
japanese,
korean,
unicode,
vietnamese