Planet Larry

May 03, 2008

Zeth

Email Syntax Check in Python

Sometimes you may want to check that an email address is not syntactically invalid, i.e. it looks like a recognisable email address. I use this approach in my zetact contact form processor.

Of course, it does not mean the address actually leads anywhere, but at least you know you are dealing with an email address that could exist.

This is the code I have been using, albeit I have changed it from a class method to a simple function to make this post simpler.

"""Email check using regex."""

def invalidreg(emailkey):
    """Email validation, checks for syntactically invalid email             
    courtesy of Mark Nenadov.                                               
    See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/65215"""
    import re
    emailregex = "^.+\\@(\\[?)[a-zA-Z0-9\\-\\.]+\\.([a-zA-Z]{2,3}|[0-9]{1,3\
})(\\]?)$"
    if len(emailkey) > 7:
        if re.match(emailregex, emailkey) != None:
            return False
        return True
    else:
        return True

I decided it would be more Pythonic to try to do this using the built-in string methods, rather than importing the re module and using a monster regular expression. Here was my first attempt.

"""Email checks using string methods - simple version."""
def invalidemail(emailaddress):
    """Checks for a syntactically invalid email address."""
    try:
        emailitems = emailaddress.rsplit('@', 1)
        emailitems.extend(emailitems[1].rsplit('.', 1))
    except IndexError:
        return True

    if len(emailaddress) < 7 or \
            [x for x in emailitems if not x.replace(".","").isalnum()]:
        return True
    else:
        return False

After a bit of testing and playing with this, a friend pointed me towards the relevant RFC on restrictions of email addresses. While the standard allows the use of many different special characters, in practice email addresses have to be much stricter if you actually want people in the real world to be able to send email to you.

For example, if we allow the email address []@commandline.org.uk, will whatever receives the output of this function be able to use it? As pointed out by Jan Goyvaerts, most software won't actually be able to handle obscure special characters.

We also don't want to water down the syntax check and allow junk for the sake of theoretical but non-existent addresses.

My compromise is to allow the special symbols -_.%+ in the local-part of the email address, and -_. in the domain name. I also do sanity checking on the top-level domain: it needs to be either a generic name or two characters long (country codes are all two letters).

So below is my current version. I added lots of comments and white space to make it easy to read.

"""Ditch nonsense email addresses."""

GENERIC_DOMAINS = "aero", "asia", "biz", "cat", "com", "coop", \
    "edu", "gov", "info", "int", "jobs", "mil", "mobi", "museum", \
    "name", "net", "org", "pro", "tel", "travel"

def invalid(emailaddress, domains = GENERIC_DOMAINS):
    """Checks for a syntactically invalid email address."""

    # Email address must be at least 7 characters in total.
    if len(emailaddress) < 7:
        return True # Address too short.

    # Split up email address into parts.
    try:
        localpart, domainname = emailaddress.rsplit('@', 1)
        host, toplevel = domainname.rsplit('.', 1)
    except ValueError:
        return True # Address does not have enough parts. 
    
    # Check for Country code or Generic Domain.
    if len(toplevel) != 2 and toplevel not in domains:
        return True # Not a domain name.
    
    for i in '-_.%+.':
        localpart = localpart.replace(i, "")
    for i in '-_.':
        host = host.replace(i, "")
    
    if localpart.isalnum() and host.isalnum():
        return False # Email address is fine.
    else:
        return True # Email address has funny characters.
    
# Start the ball rolling.
if __name__ == "__main__":
    print invalid("warrior@example.com")
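To see how the final check behaves on a few inputs, here is a minimal sketch of the same logic ported to Python 3 syntax (the listing above is Python 2). The sample addresses other than warrior@example.com and []@commandline.org.uk are made up for illustration.

```python
GENERIC_DOMAINS = ("aero", "asia", "biz", "cat", "com", "coop",
                   "edu", "gov", "info", "int", "jobs", "mil", "mobi",
                   "museum", "name", "net", "org", "pro", "tel", "travel")


def invalid(emailaddress, domains=GENERIC_DOMAINS):
    """Return True if the address looks syntactically invalid."""
    if len(emailaddress) < 7:
        return True  # Address too short.

    try:
        localpart, domainname = emailaddress.rsplit("@", 1)
        host, toplevel = domainname.rsplit(".", 1)
    except ValueError:
        return True  # Address does not have enough parts.

    if len(toplevel) != 2 and toplevel not in domains:
        return True  # Not a country code or generic domain.

    # Strip the permitted symbols, then require plain alphanumerics.
    for symbol in "-_.%+":
        localpart = localpart.replace(symbol, "")
    for symbol in "-_.":
        host = host.replace(symbol, "")

    return not (localpart.isalnum() and host.isalnum())


print(invalid("warrior@example.com"))              # False
print(invalid("first.last+tag@mail.example.org"))  # False
print(invalid("[]@commandline.org.uk"))            # True
print(invalid("no-at-sign.example.com"))           # True
print(invalid("a@b.c"))                            # True
```

Note that in Python 3 isalnum() also accepts non-ASCII letters, so this port is slightly more permissive than the Python 2 original on byte strings.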


May 03, 2008 02:00 AM :: West Midlands, England  

May 02, 2008

Thomas Keller

Installing my new Brother MFC-7820N

I bought a Brother MFC-7820N; although I do not currently intend to use the fax capabilities, the device is good in all aspects, I assume. What especially impressed me is the Linux support Brother gives. The device is not attached to my server - instead, it is directly attached to my network via LAN (it comes [...]

May 02, 2008 06:37 AM

May 01, 2008

Nicolas Trangez

Python ‘all’ oddity

[update] Question solved, see bottom of post.

Since Python 2.5 the language has a new built-in function ‘all’ (and its nephew ‘any’). I wanted to play around with this a little, combined with generators, so I created a little testcase to test performance.

Here’s the test-case: take a list L of X random numbers in a given range [A, B], and check whether

  • all elements in L are >= A
  • all elements in L are >= (A + Z) where Z is a number in [0, (B - A)]

The first test should always return True; the second test could return False.

Here’s the output of a test-run:

In [1]: import random, sys

In [2]: a = [random.randint(100, sys.maxint) for i in xrange(2000000)]

In [3]: len(a)
Out[3]: 2000000

In [4]: #Check whether all elements are >= 100 

In [5]: %timeit all(i >= 100 for i in a)
10 loops, best of 3: 515 ms per loop

In [6]: %timeit any(i < 100 for i in a)
10 loops, best of 3: 454 ms per loop

In [7]: def f(l):
   ...:     for i in l:
   ...:         if i < 100:
   ...:             return False
   ...:     return True
   ...: 

In [8]: %timeit f(a)
10 loops, best of 3: 292 ms per loop

In [9]: #Same thing for 100000, since now the list shouldn't be completely iterated

In [10]: %timeit all(i >= 100000 for i in a)
100 loops, best of 3: 4.73 ms per loop

In [11]: %timeit any(i < 100000 for i in a)
100 loops, best of 3: 4.29 ms per loop

In [12]: def g(l):
   ....:     for i in l:
   ....:         if i < 100000:
   ....:             return False
   ....:     return True
   ....: 

In [13]: %timeit g(a)
100 loops, best of 3: 2.82 ms per loop

In [14]: #For reference

In [15]: %timeit False in (i >= 100 for i in a)
10 loops, best of 3: 531 ms per loop

In [16]: %timeit False in (i >= 100000 for i in a)
100 loops, best of 3: 5.03 ms per loop

It’s as if ‘all’, ‘any’ or ‘in’ don’t break/return when a first occurrence of False (or True, obviously) is found. Is this the desired behaviour, and if it is, why? The calculation time difference between using all/any/in or a custom-made function (which is, unlike all etc, not written in C) which breaks whenever it can, is pretty astonishing.

[update] Question solved. It’s pretty normal the function-based approach performs better, since it combines what ‘all’ and the generator provided to ‘all’ do, taking away the generator function-call overhead. Damn :-)
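A small sketch can confirm both halves of that conclusion. The counting() helper below is a hypothetical name introduced only for this illustration: it records how many items ‘all’ actually pulls, which shows that it does stop at the first False. The timing part (Python 3 syntax, with a smaller list than in the session above) compares ‘all’ plus a generator expression against a plain loop; the numbers will vary by machine.

```python
import timeit


def counting(values, seen):
    """Yield each value, recording every item that gets consumed."""
    for value in values:
        seen.append(value)
        yield value


data = [500, 50, 700, 900]
seen = []
result = all(value >= 100 for value in counting(data, seen))
print(result, seen)  # False [500, 50] -> all() stopped at the first failure


def f(values):
    """Plain loop doing the same check without a generator in between."""
    for i in values:
        if i < 100:
            return False
    return True


a = list(range(100, 200100))
t_all = timeit.timeit(lambda: all(i >= 100 for i in a), number=5)
t_loop = timeit.timeit(lambda: f(a), number=5)
print("all + generator: %.4f s, plain loop: %.4f s" % (t_all, t_loop))
```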

May 01, 2008 01:57 PM

Martin Matusiak

renaming sequentially

If you’ve been dealing with files for a while you will have noticed that there is a slight semantic gap between how humans see files and how computers do. If you’ve ever seen a file list like this you know what I mean:

Lecture10.pdf
Lecture11.pdf
Lecture12.pdf
Lecture1.pdf
Lecture2.pdf

Numbering these files was done in good faith, and a user understands what it means, but the computer doesn’t get it. Sorting in dictionary order produces the wrong order as far as the user is concerned. The reason is that the digits in these filenames are not treated and compared as integers, merely as strings. (Actually, . comes before 0 in ASCII, what’s going on here?)

While we’re not expecting our computers to wise up about this anytime soon, there is the obvious fix:

Lecture01.pdf
Lecture02.pdf

Lecture10.pdf
Lecture11.pdf
Lecture12.pdf

You’ve probably done this by hand once or twice, while cursing.

On the upside, this is very easy to fix with a few lines of code:

#!/usr/bin/env python
#
# Author: Martin Matusiak <numerodix@gmail.com>
# Licensed under the GNU Public License, version 3.
#
# revision 1 - support multiple digit runs in filenames
 
import os, string, glob, re, sys
 
def renseq():
    if (len(sys.argv) != 2):
        print "Usage:\t" + sys.argv[0] + " <num_digits>"
    else:
        ren_seq_files(sys.argv[1])
 
 
def ren_seq_files(num_digits):
    files = glob.glob("*")
    for filename in files:
        m = re.search("(.*)(\\..*)", filename)
        ext = ""
        if m: (filename, ext) = m.groups()
 
        digit_runs = re.finditer("([0-9]+)", filename)
        spans = [m.span() for m in digit_runs if digit_runs]
        if spans:
            spans.reverse()
            arr = list(filename)
            for (s, e) in spans:
                arr[s:e] = string.zfill(str( int(filename[s:e]) ), int(num_digits))
            os.rename(filename+ext, "".join(arr)+ext)

 
 
if __name__ == "__main__":
    renseq()

Download this code: renseq.py

This works on all the files in the current directory. Pass an integer to renseq.py and it will change all the numbers in a filename (if there are any) to the same numbers, padded with zeros if they have fewer digits than the amount you want. So on the example

renseq.py 2

will turn the first list into the second list.

If, say, there are filenames with numbers of three digits and you pass 2 to renseq.py, the numbers will be preserved (so it’s not a destructive rename); you’ll just revert to your incorrect ordering as it was in the beginning.

renseq.py will rewrite all the numbers in a filename, but not the extension. So mp3 won’t become mp03. ;)
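The padding rule itself can be tried without renaming anything. The following is a minimal sketch in Python 3 syntax, not the script above; pad_numbers is a hypothetical helper that applies the same idea (split off the extension, zero-pad every digit run in the rest) to a plain string.

```python
import re


def pad_numbers(filename, width):
    """Zero-pad every digit run in the name, leaving the extension alone."""
    match = re.search(r"(.*)(\..*)", filename)
    stem, ext = match.groups() if match else (filename, "")
    # zfill never truncates, so longer numbers are preserved as they are.
    padded = re.sub(r"[0-9]+",
                    lambda run: str(int(run.group())).zfill(width),
                    stem)
    return padded + ext


names = ["Lecture10.pdf", "Lecture11.pdf", "Lecture12.pdf",
         "Lecture1.pdf", "Lecture2.pdf"]
for name in sorted(pad_numbers(n, 2) for n in names):
    print(name)

print(pad_numbers("track3.mp3", 2))      # track03.mp3
print(pad_numbers("Lecture100.pdf", 2))  # Lecture100.pdf
```

After padding, plain string sorting gives the order a human expects, which is the whole point of the rename.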

May 01, 2008 01:32 PM :: Utrecht, Netherlands  

Dieter Plaetinck

Windows sucks

I had to fix a problem at my dad's company...
"The network was broken."

It was a NetBEUI network connecting some windows stations - it has been running for years - and now suddenly the nodes couldn't find each other.
One of the boxes (windows 2000 iirc) had 2 network cards, one for the network, the other not used for anything (not even connected). Disabling the latter - not even touching the former - fixed half of the network.


May 01, 2008 12:22 PM :: Belgium  

Brian S. Stephan

Sender Policy Framework

Someone in #lh today told me about Sender Policy Framework, which sounds like a badly-needed enhancement to the Internet’s email protocols. Basically, the idea is to provide a DNS record that informs MTAs “don’t trust emails claiming to be from this domain unless they’re coming from one of my actual servers”.

In DNS, this looks like (in my case):

emptymatter.org. IN TXT "v=spf1 a mx ~all"

Some MTAs support SPF but need to be configured; I believe Gentoo’s postfix is one of them. If I’m going to expect other mail servers to support it, I probably should myself. I’ll have to tackle that another day…
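To read the record above: each term after the version tag is a mechanism with an optional qualifier ("+" pass, "-" fail, "~" softfail, "?" neutral), and a missing qualifier means "+". Here is a minimal sketch that splits a record into those pieces; parse_spf is a hypothetical helper, not a full SPF parser (modifiers such as redirect= are not treated specially, and nothing is looked up in DNS).

```python
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record):
    """Split an SPF TXT record into (result, mechanism) pairs."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")

    parsed = []
    for term in terms[1:]:
        if term[0] in QUALIFIERS:
            qualifier, mechanism = term[0], term[1:]
        else:
            qualifier, mechanism = "+", term  # default qualifier
        parsed.append((QUALIFIERS[qualifier], mechanism))
    return parsed


print(parse_spf("v=spf1 a mx ~all"))
# [('pass', 'a'), ('pass', 'mx'), ('softfail', 'all')]
```

So the record says: mail from the domain's A and MX hosts passes, and everything else is a soft failure.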

May 01, 2008 12:20 AM :: Wisconsin, USA  

April 30, 2008

Steven Oliver

steveno


Do you remember the last time I posted on here? I don’t LOL.

Regardless, this weekend I am planning a Gentoo install party. Sadly I will be the only one attending but since Paludis 0.26.1 is out I see no reason to delay the return of the King. Naturally I am the aforementioned king. No I’m really not that arrogant. I just play that way on the internet.

Enjoy the Penguins!

April 30, 2008 11:58 PM :: West Virginia, USA  

Christoph Bauer

Microsoft Delays Windows XP Service Pack 3

Since Heise announced that Microsoft will release the Windows XP Service Pack 3 on the 29th, I didn’t sleep too well, as I really want to grab it as soon as possible. Sure, I use Linux, but this doesn’t mean that I don’t deal with Windows boxes from time to time, and I am fed up with applying about 100 patches before I can even think of security.

But I rejoiced too soon: today, just one day after the 29th, I saw a posting on the Washington Post Blog saying that Microsoft has delayed the release of the service pack again. In a written statement they say:

“In order to make sure customers have the best possible experience we have decided to delay releasing Windows XP SP3 to Windows Update and Microsoft Download Center.”

In other words, there seems to be no release date yet. Well - in the meantime I’m rolling my own update pack using the Heise Offline Update. Thanks a ton, guys.



April 30, 2008 06:52 AM :: Vorarlberg, Austria  

Brian S. Stephan

Yet another lazy post

Nothing exciting here. Got my tax refunds. Might build a home-made NAS with a couple terabytes of disk and put it in the basement.

On the DS, I’ve been playing Rondo of Swords and The World Ends With You. Rondo is a pleasant find, a difficult but still reasonable strategy RPG that makes one think and plan ahead, unlike games such as Revenant Wings which are much more “bring a healer and just mob everyone at the thing they’re strong against!” Also, I have a crush on Atlus by this point. There’s no denying it now. I draw their name with little hearts all around when I’m in meetings.

The World Ends With You is refreshingly original, one of those games that, even with it being Square Enix, is a bit surprising that it made it to the States. Very Japanese, and the game makes few concessions to the English audience. Sure, long gone are the times of gratuitous name changes, but even the j-pop/j-rock soundtrack remains intact, and that is, to my slightly jaded mind, a bit commendable. Now, if only the main character didn’t suffer from two vile Square Enix staples: unimaginable thinness and nearly sickening teenage angst. Neku is supposed to get better with the latter; I hope it is soon.

My games to beat are now Etrian Odyssey and Rondo of Swords, one I must beat before Etrian Odyssey II (guess which one) is released here, and the other before the Final Fantasy IV remake reaches the States. I’m excited. If I have time before those, Final Fantasy III and The World Ends With You are my RPGs to beat. FF3 is a cakewalk thus far, but its ease and its crude mechanics compared to Final Fantasy V make it hard to stay with for long.

I didn’t really intend this to become all about video games. I’ve been working on a Gentoo Wiki page for the HP 2133 which has kind of slowed down as most of the parts I’m interested in are supported as best they can be without new versions of drivers, I think. There’s some other hardware that I need to try out (the webcam, for example), but I don’t really care that much, so it’s low priority. Notebooky stuff works.

I have a Waterfield Designs bag coming soon, which I’m excited about. Don’t think it will be suitable for gaming books, but I still have that backpack which is going on 5+ years. The little trooper.

I’ve been meaning to survey the gaming group and associated friends to see what they’re using for IM these days. I think the answer for some is “nothing”, with a couple saying “AIM on occasion” or “I idle on Google Talk”, so I’ve not really been motivated to test those waters. I want to get a private Jabber conference room running for the group, since the IRC thing kind of sputtered off and died (I still idle there!), but I know it means getting people to switch to Jabber (or at least Google Talk) and then getting them to use a non-Google Talk client (Pidgin, I bet, but maybe Trillian would work). Sigh. If anyone has interest in switching to one network (I highly suggest a Jabber-like [“XMPP” for the techies]), or trying out conferences, or whatever, email/IM me and we’ll play around.

This really is getting rambly, and people might expect me to write long posts all the time. So I’m wrapping this up by saying that spring is finally here, and that’s why it snowed yesterday.

April 30, 2008 03:31 AM :: Wisconsin, USA  

April 29, 2008

Zeth

Three more tips - use keybindings, scripts and SSH without passwords

Use Readline shortcuts

At the bash prompt, you can use the default readline keybindings, which are similar to the Emacs ones. Many of these are also available within other programs that use readline, such as the Python interpreter.

Here are some useful ones:

Ctrl-A Beginning of Line
Ctrl-E End of Line

Ctrl-U Kill (cut) everything left of cursor
Ctrl-K Kill (cut) everything right of cursor
Ctrl-W Kill (cut) the single word before the cursor
Ctrl-Y Yank (paste) the text back

Ctrl-L Clear Screen
Ctrl-D Exit
Ctrl-R Reverse interactive-search, (attempt to complete what is currently being typed using the history file)

SSH without Passwords

If you login to a remote machine often and you get bored of typing the password, then you can use public key cryptography instead.

The way it works is that the remote machine has a copy of your local machine's public key, it can then use that to check that your local machine is really your machine, and so let you in.

To start with, on the local machine, see if you already have a key pair:

ls ~/.ssh/id_?sa.pub

If not, then make one:

ssh-keygen -t dsa

Now you need to copy your public key to the remote host. On the local machine run:

scp ~/.ssh/id_?sa.pub remotehost:

Now we login to the remote server:

ssh remotehost

Append the public key to your authorized keys file

cat id_?sa.pub >> ~/.ssh/authorized_keys

Now you can login without passwords. Make sure the security of your machines is well thought out. Use disk encryption if possible.
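One thing the cat >> step does not guard against is appending the same key twice. As a rough sketch (Python 3 syntax), the append could be scripted so that it is skipped when the key is already present. add_authorized_key is a hypothetical helper, and the key text in the demo is a made-up placeholder; the restrictive permissions reflect the fact that sshd typically refuses key files that other users can write to.

```python
import os
import tempfile


def add_authorized_key(pubkey_line, ssh_dir):
    """Append a public key to authorized_keys unless it is already there."""
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    path = os.path.join(ssh_dir, "authorized_keys")
    key = pubkey_line.strip()

    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if key in (line.strip() for line in content.splitlines()):
        return False  # Key already authorised, nothing to do.

    with open(path, "a") as handle:
        if content and not content.endswith("\n"):
            handle.write("\n")  # Avoid gluing the key onto the last line.
        handle.write(key + "\n")
    os.chmod(path, 0o600)
    return True


# Demo against a throwaway directory rather than the real ~/.ssh.
with tempfile.TemporaryDirectory() as scratch:
    demo_dir = os.path.join(scratch, ".ssh")
    print(add_authorized_key("ssh-rsa AAAAexample user@laptop", demo_dir))
    print(add_authorized_key("ssh-rsa AAAAexample user@laptop", demo_dir))
```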

Create a script directory in home directory

I often talk about random Python or bash scripts. The easy way to use them on Linux is to make a dedicated script directory for these.

mkdir ~/bin

Add it to your shell's path. Edit ~/.bashrc and add:

export PATH=$HOME/bin:$PATH

Now all the scripts that you add to ~/bin are always available. This makes things a lot more flexible and fun as you can try out various scripts by dropping them in ~/bin and then deleting them when you are bored of them.


April 29, 2008 09:00 PM :: West Midlands, England  

Jürgen Geuter

Firefox3 beta4 bug that annoys me

This Firefox3 beta4 bug is really annoying. Whenever I click a link on my email workspace, the browser window is pulled to that workspace. Really needs fixing (yeah I could write devilspie rules but I shouldn't have to).

April 29, 2008 12:10 PM :: Germany  

April 28, 2008

Jürgen Geuter

Bugmenot Extension for Firefox3 beta

I don't know when it was updated, but the bugmenot extension works with the current betas.

Bugmenot collects logins for those sites that require you to log in to read their content (like for example nytimes.com) and allows you to use those "throw-away" logins when you stumble on a page like that: You just rightclick the form and select "Log in with Bugmenot" and the extension will try the logins bugmenot has to get you into the page.

Terribly useful and didn't use to work with the firefox betas but now it does, which really rocks. If you have not used it so far, do it now.

April 28, 2008 10:08 PM :: Germany  

Dan Ballard

Setting up a remote git repository with just git

So Ubuntu hardy doesn't ship with the handy git wrapper/tool cogito because git now has all of its features incorporated... somewhere...

But documentation is surprisingly sparse. Anyway, if you want to set up a git repo nowadays using just git, it should go something like this:

root@server # cd /git
root@server # mkdir newrepo
root@server # chgrp git newrepo
root@server # chmod g+ws newrepo
root@server # cd newrepo
root@server # git init

And if this is a public repository

root@server # touch git-daemon-export-ok

On the client side.

user@client $ cd project
user@client $ git init
user@client $ git add *
user@client $ git commit -m "Initial code dump"
user@client $ git remote add origin ssh://user@git.server.com/git/newrepo
user@client $ git push origin master

and after that regular

user@client $ git push

works just fine.

April 28, 2008 07:30 AM :: British Columbia, Canada  

April 27, 2008

Jürgen Geuter

Git and Windows

The fact that you are tied to a bad proprietary operating system does not mean that you have to live without good version control. I recently had to do some work on Windows (but I ran it in a VirtualBox) and of course I needed to connect to my git repositories.

Msysgit does not just give you access to the power of git but also makes your Windows experience a lot more bearable: It installs ssh and all the necessary GNU tools like rm, ls and others. You even get a functional terminal to replace Windows' clunky cmd.

Even if you don't need git you should probably install it on any Windows box to make sure to stay sane: Typing rm filename and getting an error message feels really weird and that is fixed then ;-)

EDIT: Typo fixed.

April 27, 2008 07:16 PM :: Germany  

Alex Bogak

Cellular Video Calls: reality that never happened?

Hi all

I recently started working for Comverse - the company supplies solutions for telephony providers, mainly cellular ones. Our product lies in the core of the operator's network and manages all or some of the services provided by the operator, such as Voice Mail, SMS, MMS, Video Calls, etc. Our system can provide a complete solution or integrate its parts with other available solutions in the market.

I am currently going through training, and an interesting thought came up during my studies. Over the last few years, some cellular operators have given the impression that video calling was the major driver behind the transition to fast networks such as 3G, 3.5G and later generations. While that is true in some cases, I am not sure the feature is really that valuable.

Just think about it: would you make a video call using a modern handset that has a video camera? Of course not - you'd have privacy issues right away. Do you really want the whole world to hear what you are saying? So what is the point of having a fast network without providing any type of service on it? This is probably one of the reasons that cellular providers have a problem: they have the infrastructure, but no services to monetize it. So everything else costs more to cover the losses. And this is something that I as a consumer do not like.

I wonder why it is that, in my area, we do not have unlimited cellular data plans. We do have various packages, but they are all paid per minute or per MB of data - much like dial-up used to be ages ago. It really would be great to have internet everywhere, and I think that cellular companies are not getting something here.

It's not that they make more money on a pay-per-minute/byte basis: I simply don't buy the service at all while this is the payment scheme. So the typical users are business folks who have to have access to their email at all times. And even then, better options exist (we have WiFi hotspots almost everywhere now).

Just wonders of the world I guess.

April 27, 2008 03:16 PM :: Israel  

Matt Harrison

pyExcelerator (xlwt) cheatsheet (create native Excel from pure python)

If you are looking to generate/create/write (but not read) excel spreadsheets from native python (read on linux or macos or even Windows!), then xlwt (a fork of pyExcelerator) is your friend. This library, (and the ancestor) is somewhat unknown, but

April 27, 2008 06:37 AM :: Utah, USA  

Nirbheek Chauhan

<3 X, PulseAudio, and DAAP

So, right now, I'm sitting at my comp listening to Norah Jones. But this isn't like any other music-listening time. Right now, I'm:


  1. Logged into a lab computer via XDMCP: I could've used VNC, but that would've required someone to be logged-in on the lab comp.

  2. Using my laptop's PulseAudio as the lab computer's default PulseAudio sink: This makes the lab computer's PulseAudio send all sound to my laptop's PulseAudio by default.

  3. Connected to my laptop's DAAP share from the lab computer's Rhythmbox: The music on my laptop becomes accessible from the lab computer's Rhythmbox.



This setup results in me playing Norah Jones on the lab computer, and listening to it here :)

April 27, 2008 01:28 AM :: Uttar Pradesh, India  

April 26, 2008

Jürgen Geuter

"The shroud of the PU side has fallen. Begun, the Processor War has."

We, as today's computer users, are living at the brink of a war. A war that might change computing as we know it. Well it also might not, but I kinda liked that overly dramatic overture ;-)

The computers we use today are pretty much all just 8086 computers with delusions of grandeur. Processors are getting faster every few months and when technological barriers make tuning those CPUs to run even faster impossible, we just throw in more "cores". At least that's what Intel and AMD are doing and it's working.

I just got a new laptop recently which contains my first dual core processor and it really makes many things a lot faster, especially if you are as much into multitasking (as in having many programs run) as I am. But while processors get faster, we still wait pretty much as much as we have always done.

Software has gotten more complex, it offers more features, it's prettier, it automates more or it "wastes" CPU-cycles to offer more convenient ways to program. Whatever it is, software eats up what the processor builders give, old story.

Now Nvidia, namely Nvidia’s chief executive officer Jen-Hsun Huang and some guy named Roy Taylor (also working at Nvidia), claim that the CPU is dead:
"Basically the CPU is dead. Yes, that processor you see advertised everywhere from Intel. Its run out of steam. The fact is that it no longer makes anything run faster. You don’t need a fast one anymore. This is why AMD is in trouble and its why Intel are panicking. They are panicking so much that they have started attacking us. This is because you do still [need] one chip to get faster and faster – the GPU."


Are they right? Do Intel and AMD fear Nvidia the graphics chip company? Are they the ones who make your computer faster?

Well they do make your computer faster when it comes to games. All those shiny textures have to be arranged and lighted which is a really complex task. GPUs are optimized to do that, they beat any generic processor at it hands down. Because they are specialized on calculating graphics.

It's just like hammers: You can buy one hammer and probably do all the hammering you will ever need, but if you really have to hammer a lot you will buy specialized hammers that suit your need better. Maybe you need smaller or bigger heads, maybe your hammer needs to be lighter or the generic form doesn't suit your needs all that great.

This example also teaches us why those Nvidia fellas are wrong: Yes you need a certain level of hammerish-quality in your hammer to make it sufficient but for most people the generic hammer is good enough cause all they do is put a few nails into walls to hang pictures from.

Many, many people don't use their PCs to play games. They surf, they email, they calculate spreadsheets or write texts. They create images and designs, they code and create all those applications that you and me use every day. The point is: They don't do anything that requires sophisticated graphics hardware.

3D desktop effects are neat. Your graphics adapter having hardware acceleration for video playback so you can watch those blu-ray things without burning your CPU is kinda neat, but that's not really all that graphics intensive, the lower end of integrated graphics adapters can do that (or will do that soon).

I think the Nvidia guys are pretty much whistling while walking through the dark woods cause they know that they are in danger. Yes we could raise the graphics quality of games every two years but we are approaching a level of detail that is pretty much a plateau for a few reasons:

  • Some people just don't see the improvements anymore cause their monitor/TV is not good enough to actually display the finer details.

  • Adding even more detail makes creating those games more expensive. You need more people creating all that detail and man-hours can be quite expensive and add up fast.



People start to settle for worse looking games: The Nintendo DS sells like crazy cause there's just a great time to be had with that underpowered handheld. People play games on their ipods or cellphones and those games are generally on the lower end of the graphics spectrum because of the limitations of the hardware. The high-end graphics fetishist probably owns a dedicated gaming console with a hi-def television attached and might not even wanna bother with playing games in his office when he could play awesome looking games while being curled up on the couch.

Personally I think Nvidia might have it all wrong (well they actually know how it is but are claiming it's different): Graphics chips in PCs will lose importance because the budget-level onboard chips have gotten quite good enough for the usual needs. If they were so right we'd see specialized audio-chips pop up left and right (but even Windows Vista pretty much removed all audio hardware features in exchange for better software mixing [which means the CPU does all the work]). We'd see physics cards emerge (and I know that those things exist but who really has those apart from kids with too much money?). But we don't. Cause software emulation is good enough in most parts and where it's not the cheap graphics hardware has gotten good enough.

So probably we don't have a war at all. Maybe it's just a company trying to bait Intel into buying them for a big pile of money cause they know that if things continue to develop as they do they might get pushed out of the mainstream market into a niche. And niches don't always pay all that well.

April 26, 2008 09:12 PM :: Germany  

Martin Matusiak

download all media links on a webpage

This has probably happened to you. You come to a web page that has links to a bunch of pictures, or videos, or documents that you want to download. Not one or two, but all. How do you go about it? Personally, I use wget for anything that will take a while to download. It’s wonderful, accepts http, https, ftp etc, has options to resume and retry, it never fails. I could just use Firefox, and if it’s small files then I do just that, and click all the links in one fell swoop, then let them all download on their own. But if it’s larger files then it’s not practical. You don’t want to download 20 videos of 200mb each in parallel, that’s no good. If Firefox crashes within the next few hours (which it probably will) then you’ll likely end up with not even one file successfully downloaded. And Firefox doesn’t have a resume function (there is a button but it doesn’t do anything :rolleyes: ).

So there is a fallback option: copy all the links from Firefox and queue them up for wget: right click in document, Copy Link Location, right click in terminal window. This is painful and I last about 4-5 links before I get sick of it, download the web page and start parsing it instead. That always works, but I have to rig up a new chain of grep, sed, tr and xargs wget (or a for loop) for every page, I can never reuse that and so the effort doesn’t go a long way.

There is another option. I could use a Firefox extension for this, there are some of them for this purpose. But that too is fraught with pain. Some of them don’t work, some only work for some types of files, some still require some amount of manual effort to pick the right urls and so on, some of them don’t support resuming a download after Firefox crashes. Not to mention that every new extension slows down Firefox and adds another upgrade cycle you have to worry about. Want to run Firefox 3? Oh sorry, your download extension isn’t compatible. wget, in contrast, never stops working. Most limiting of all, these extensions aren’t Unix-y. They assume they know what you want, and they take you from start to end. There’s no way you can plug in grep somewhere in the chain to filter out things you don’t want, for example.

So the problem is eventually reduced to: how can I still use wget? Well, browsers being as lenient as they are, it’s difficult to guarantee that you can parse every page, but you can at least try. spiderfetch, whose name describes its function: spider a page for links and then fetch them, attacks the common scenario. You find a page that links to a bunch of media files. So you feed the url to spiderfetch. It will download the page and find all the links (as best it can). It will then download the files one by one. Internally, it uses wget, so you still get the desired functionality and the familiar output.

If the urls on the page require additional post-processing, say they are .asx files you have to download one by one, grab the mms:// url inside, and mplayer -dumpstream, you at least get the first half of the chain. (Unlikely scenario? If you wanted to download these freely available lectures on compilers from the University of Washington, you have little choice. You could even chain spiderfetch to do both: first spider the index page, download all the .asx files, then spider each .asx file for the mms:// url, print it to the screen and let mplayer take it from there. No more grep or sed. :) )

Features

  • Spiders the page for anything that looks like a url.
  • Ability to filter urls with a regular expression (keep in mind this is Ruby’s regex syntax, so use .* to match any run of characters, not * as in file globbing, (true|false) for choice and so on).
  • Downloads all the urls serially, or just outputs to screen (with --dump) if you want to filter/sort/etc.
  • Can use an existing index file (with --useindex), but then if there are relative links among the urls, they will need post-processing, because the path of the index page on the server is not known after it has been stored locally.
  • Uses wget internally and relays its output as well. Supports http, https and ftp urls.
  • Semantics consistent with for url in urls; do wget $url… does not re-download completed files, resumes downloads, retries interrupted transfers.
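The spidering step in the list above can be sketched in a few lines of Python. This is not spiderfetch's code (spiderfetch is written in Ruby and its matching is more lenient); it is a minimal illustration that assumes links appear in href or src attributes. The names LINK_RE and find_urls are made up for this sketch. It also shows why the url of the index page matters: relative links can only be resolved against it.

```python
import re
from urllib.parse import urljoin

# Assumption for this sketch: links live in href="..." or src="..." attributes.
LINK_RE = re.compile(r'(?:href|src)\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def find_urls(html, base_url):
    """Return absolute urls found in html, in page order, without duplicates."""
    found = []
    for link in LINK_RE.findall(html):
        # Relative links are resolved against the url of the index page.
        absolute = urljoin(base_url, link)
        if absolute not in found:
            found.append(absolute)
    return found


if __name__ == "__main__":
    page = '<a href="video/talk.ogg">talk</a> <a href="http://example.org/a.asx">a</a>'
    for url in find_urls(page, "http://example.org/media/index.html"):
        print(url)
```

Without the base url, as with a locally stored index file, the first link in the example would stay relative and would need post-processing before wget could fetch it.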

Limitations

  • Not guaranteed to find every last url, although the matching is pretty lenient. If you can’t match a certain url you’re still stuck with grep and sed.
  • If you have to authenticate yourself somehow in the browser to be able to download your media files, spiderfetch won’t be able to download them (as with wget in general). However, all is not lost. If the urls are ftp or the web server uses simple authentication, you can still post-process them to: ftp://username:password@the.rest.of.the.url, same for http.

Download spiderfetch:

Recipes

To make the use a bit clearer, let’s see some concrete examples.

Recipe: Download the 2008 lectures from Fosdem:

spiderfetch.rb http://www.fosdem.org/2008/media/video 2008.*ogg

Here we use the pattern 2008.*ogg. If you first run spiderfetch with --dump, you’ll see that all the urls for the lectures in 2008 contain the string 2008. Further, all the video files have the extension ogg. And whatever characters come in between those two things, we don’t care.
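To see what such a pattern keeps and what it drops, here is a small Python sketch. The urls are invented for illustration and are not the real Fosdem urls; for a pattern this simple, Python's re module behaves the same way as Ruby's regex engine.

```python
import re

# Made-up urls for illustration; the real page's urls will differ.
urls = [
    "http://example.org/media/2008/keynote.ogg",
    "http://example.org/media/2007/keynote.ogg",
    "http://example.org/media/2008/slides.pdf",
]

# "2008", then any run of characters, then "ogg", anywhere in the url.
pattern = re.compile(r"2008.*ogg")
matches = [url for url in urls if pattern.search(url)]

for url in matches:
    print(url)
```

Only the first url survives: the second lacks 2008 and the third lacks ogg.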

Recipe: Download .asx => mms videos

Like it or not, sometimes you have to deal with ugly proprietary protocols. Video files exposed as .asx files are typically pointers to urls of the mms:// protocol. Microsoft calls them metafiles. This snippet illustrates how you can download them. First you spider for all the .asx urls, using the pattern \.asx$, which means “match on strings containing .asx as the last characters of the string”. Then we spider each of those urls for actual urls to video files, which begin with mms. And for each one we use mplayer -dumpstream to actually download the video.

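#!/bin/bash

# Directory this script lives in; spiderfetch.rb is expected alongside it.
mypath=$(cd "$(dirname "$0")" && pwd)
webpage="$1"

# Spider the index page for .asx urls, then each .asx file for its mms:// url.
for url in $("$mypath/spiderfetch.rb" "$webpage" '\.asx$' --dump); do
	video=$("$mypath/spiderfetch.rb" "$url" "^mms" --dump)
	# Save the stream under the last path component of the mms url.
	mplayer -dumpstream "$video" -dumpfile "$(basename "$video")"
done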
 

Download this code: asx_spiderfetch.sh

April 26, 2008 07:44 PM :: Utrecht, Netherlands  

Ian Monroe

Taiwan Day 0

Getting There
Yesterday I arrived in Taipei after about 22 hours of airplanes and airports. At least for a Missouri guy like me, it seems sprawling and huge. Fellow Amarok dev and Sydney resident Seb Ruiz isn't so impressed. I didn't get lost on my way to the hostel until the very last block. I walked into a bakery probably looking quite lost (I had been walking around a single block for about 10 minutes) and one of the workers asked me "do you need help?" in English. I showed her the address and phone number of the hostel; she whipped out her mobile and called it, got directions and walked me to the entrance (about 20 meters away actually). So the Taiwanese are very friendly; Seb had a similar experience of someone actually taking him up the elevator directly to the hostel.

The hostel itself (the "Camel's Oasis"), is very homey. I found Seb chatting with a Québécoise about her 6-month world tour (you always find such lucky people at cheap hostels). I did follow Seb's advice and avoided sleeping much on the flight over here (only about 3 hours), but my biological clock still objected to the 11 hour timeshift; I had some pretty spotty sleep last night.

OpenTech Conference
Now we're at Asus for the OpenTech Summit. This morning is being devoted to Asus. Ellis, the product manager for the Asus EeePC (IIRC), presented on how Asus intends to work with the community. A Xandros developer, Brian, went over the SDK for the Asus EeePC, which is basically Xandros distributed with Eclipse and Vmplayer. The main development environment they are supporting is Qt 4.2; he did a "Hello World" using Eclipse's Qt Designer integration. Both Ellis and Brian talked about the method for ISVs to release software for the EeePC; this interests me since it would be really nice to have a way to release the newest Amarok versions directly to the EeePC Add/Remove Programs system. The next version of the EeePC OS is apparently going to make it easy to install .deb's or an EeePC-specific tarball which contains a bunch of .deb files.




April 26, 2008 05:24 AM :: Missouri, USA  

April 25, 2008

Jürgen Geuter

The hand that feeds ...

I'm following quite a bunch of webcomics and Penny Arcade is in that list of comics (it's not great but every 10th or 20th comic is funny).

Well Penny Arcade can teach us a lesson about how to not do something: Prematurely updating the RSS feed.

I don't know PA's setup and whether they have a bunch of webservers doing load-balancing or whether they just have one server, but they have a really annoying issue: Their RSS feed updates before their website does. The new comic's entry is there but the actual image has not yet been uploaded for some reason, so you see the new comic in your feed reader but when you want to view it, you get a comic-less page.

RSS feeds have gotten quite popular (and rightfully so!). I actually stopped reading webcomics without a feed cause going to all those websites manually was just too much of a waste of my time. RSS feeds allow a content provider to inform interested parties that something on his website changed, that he has produced new content and that he invites previous readers back to his page (or hers of course). If you consider RSS to be that kind of invitation it becomes obvious how carefully you have to maintain your feed.

People and companies spend big cash to make sure their websites look great and represent them well. The same is true for business cards: You don't wanna hand out a business card that looks shabby, it will make your business look unprofessional and you'll lose (potential) customers and therefore money.

Your website's RSS feed is as important as your business card, as your "real" website, actually it's replacing your real website more and more. Having a broken or nonfunctional RSS feed is probably even worse than having none: If you have none some people might not check your site regularly but at least you don't look incompetent. Here are a few rules you should obey to make sure your feed doesn't make you look bad:


  • Don't plaster advertising all over your feed. Advertising on your website is bad enough (every time I have to surf without an adblocker I die a little inside), but littering your feed which is supposed to be an invitation to come to your page is like putting ads for your local burger place on your wedding invitation: Makes you look really bad.

  • Make sure your RSS feed is current: If your RSS feed lags behind by a few hours, or publishes stories prematurely, people get to your site and find information that is either outdated or (even worse!) not yet there (remember the Penny Arcade example), and that annoys them. And annoyed people are often quite quick with the "unsubscribe from feed" button, which makes your site invisible to them.

  • Don't castrate your feed: If your feed just contains the first 3 words of every article people will click on many of your links at first (bringing in a lot of that dirty advertising money) but after they have clicked 10 times and found something uninteresting 9 times they might start to be very hesitant to click your links anymore. If you are not considerate enough to offer the full content in your feed make sure to at least offer enough text in it so people can make an educated decision whether they want to read your article.

  • People subscribing to your feed invested some trust into you. They included your content into their personal information stream which is (in times like these with all the information overload we face each day) not a small gift to you. Every time you cheat people to come to your site with a sensationalist heading with the article not being able to live up to their expectations you spit on that trust. Never forget: If someone does not know you your rank with that person is neutral, but someone who had at some point subscribed to your feed and unsubscribed ranks you pretty low: People usually avoid deleting, they want to keep stuff around, even mediocre feeds (don't we all have some of those feeds around that are not actually good but we keep around for sentimental reasons?), so if they delete your feed you have really annoyed them. Don't let that happen cause they'll probably never come back.



The importance of RSS feeds and their look has not spread enough as we can see in many many bad feeds. Don't make the mistake of looking bad with your company or your personal website just cause you didn't think enough about your virtual "business card".

April 25, 2008 06:35 PM :: Germany  

Thinkpad

Buying a Thinkpad turned out to be a really great idea: Everything works flawlessly on linux without issues. Backlight is being dimmed when I'm idle, wireless works without any extra hoops to jump through, hibernate works. That's how it would be if finally all vendors would decide to get their heads out of their asses and provide open drivers or at least specs. What a glorious world that could be.

April 25, 2008 06:09 PM :: Germany  

Zeth

Twelve commandments for Beautiful Python code

Living Code

David Parker famously said that texts are living, once they leave the pen of the author then they have a life of their own, you never know where the text will end up or how it will be modified. For Python code that is even more true.

The beauty of Python is that you can write code fast, share code and modify code. For this to work, your code needs to be readable. Writing code is easy; reading other people's code is much harder, as is reading your own code after a few months or years have passed.

Therefore the aim is to make code as readable as possible, even if it causes a little more work when you write it. The way to make your Python code most readable is to keep to the Style Guide for Python Code, also known as PEP8.

Pylint for the Win

It is far easier to keep your code valid to PEP8 as you go along, than to try to move a large codebase to PEP8 at the end. I recommend the use of a tool called pylint.

Pylint is available from all Linux distributions' package managers (e.g. apt-get install pylint or emerge pylint). Here are some instructions for Windows.

If you have ever made a webpage you probably know about HTML-tidy or the online W3C Validator tool. These tell you everything wrong with your HTML.

Pylint is similar, it goes through and tells you both syntax errors and also how your code differs from the PEP8 standard.

There are some corner cases in which you will need to give pylint the finger, but doing it consciously for good reason is better than because you are sloppy.

PEP8 is better than your crappy style

People often don't use PEP8. This is for a variety of (bad) reasons.

Firstly, sometimes people are tourists from another programming language, they do not know any better so they write their Python code like it was Java or C code.

Secondly, sometimes people think their (cl)own style is better than PEP8 in some technical way. Well that does not matter. I might have a better way to design a plug socket, but if I implemented my better plug socket, I would not be able to buy any electrical devices.

There can only be one standard, and PEP8 is that standard. If you want to change that standard then bribe, sleep with or kill Guido van Rossum.

Not following the standard makes your code less readable to others, this prevents the quick reuse that Python is designed for (see above).

If you are a free-software/open-source project, then you particularly should be ashamed if you write hard to read code, because allowing other people to read, understand and modify your code is the whole point.

Lastly, some people don't use PEP8 because the document is too circular and verbose for them to remember. I feel your pain, below are the main points in 12 easy rules.

The 12 commandments

Guido, who brought you out of the land of Visual Basic, out of the land of slavery, spake all these words to thee:

  • Module names should be in all lowercase - hello.py.
  • Top level variables (variables that are not in a function or class) should be in BLOCKCAPITALS.
  • Class names should be in CamelCase.
  • Methods and functions should be in lower_with_underscores
  • Implementation-specific 'private' methods _single_underscore_prefix
  • Especially private non-subclassable methods __double_underscore_prefix
  • If a variable inside a function or method is so temporary and disposable that you cannot give it a name, then use i for the first one, j for the second and k for third.
  • Indentation is four spaces per level. No tabs. If you break this rule then you must be stoned in the village square.
  • Lines are never more than 79 characters wide. Tip: PEP8 prefers breaking lines inside parentheses, brackets or braces, where no continuation character is needed; otherwise break lines with a backward slash \.
  • Spaces after commas, (green, eggs, and, ham)
  • Spaces around operators i = i + 1
  • Write docstrings for all public modules, functions, classes, and methods. Python is an international community, so use English for docstrings, object names and comments. If you want to provide local translations then use a proper localisation library.
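As an illustration only, here is a small made-up module that tries to follow the rules above. None of the names come from PEP8 or any real library; GreetingMaker, DEFAULT_GREETING and count_letters are invented for this sketch.

```python
"""Example module; it would live in a lowercase file such as hello.py."""

DEFAULT_GREETING = "Hello"  # top level variable in capitals


class GreetingMaker:
    """Builds greetings (class name in CamelCase)."""

    def __init__(self, greeting=DEFAULT_GREETING):
        self._greeting = greeting  # single underscore: implementation detail

    def make_greetings(self, names):
        """Return one greeting per name (method in lower_with_underscores)."""
        return [self._format_line(name) for name in names]

    def _format_line(self, name):
        """'Private' helper, private by convention only."""
        return self.__join(self._greeting, name)

    def __join(self, first, second):
        """Double underscore prefix: the name is mangled per class."""
        return first + ", " + second


def count_letters(words):
    """Return the total number of letters in words."""
    total = 0
    for i in words:  # throwaway loop variable
        total = total + len(i)
    return total
```

Running pylint over a file like this is a quick way to check how close it actually comes to the standard.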

Discuss this post - Leave a comment

April 25, 2008 06:00 PM :: West Midlands, England  

Matt Harrison

Lasers/Wiimotes/Python presentation at Utah Code Camp

For those in the Utah vicinity, and who want to see a cool presentation about python, wiimotes and lasers (or who really like MS products, since most sessions seem dedicated to that), my brother is presenting at Utah Code Camp. It should be a little

April 25, 2008 03:19 PM :: Utah, USA  

[Pycon 2008] Lasers, Webcams, Wiimotes and Python Video up

My brother gave a talk about using python to detect lasers with webcams and also demo'd a prototype of a 3d game with wiimote headtracking. For those who missed the preso, there's now a video up.

April 25, 2008 03:15 PM :: Utah, USA  

Jürgen Geuter

Steve Yegge is right in spite of the Mac crowd's belief

Steve Yegge wrote about "Focus Follows Mouse" today and how its lack on Apple's OSX makes him a very unhappy camper. I've always been a vocal supporter of "Focus Follows Mouse" and, just as Yegge, I hate "autoraise" that many people seem to think is identical to "Focus Follows Mouse" with a fiery passion.

What is funny is how, after he had admitted to having switched to OSX on all his client machines, the Mac crowd tried everything they could to defend their platform, even when he had already shown that the workarounds they were going to suggest were wrong. He even had to close the comments ;-)

Was quite a funny read and his explanation on what "Focus Follows Mouse" is is quite good, so read it!

April 25, 2008 12:28 PM :: Germany  

April 24, 2008

Michael Klier

The Twitter Blacklist And Another Greasemonkey Script

If you, like me, use twitter on a regular basis, you may like this one.

There's a great new site around called The Twitter Blacklist. It was created by Earle Martin and intends to gather a list of all the spammers and morons who either try to use the service to promote their nonsense products/websites or are simply attention addicts. In both cases, these people blindly follow as many other people as they can. The best indicator of whether someone is a twitter spammer is the ratio between how many people they follow and how many follow them.

1:5 = twittercaster, 1:2 = notable, 1:1 socially healthy, 2:1 newbie or social climber, 5:1 twitter spammer. - Evan Prodromou

For a couple of days now the twitterblacklist has had a simple, yet nice API which allows you to check whether a certain user is listed. This is where Greasemonkey enters the game :-).
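The rule of thumb quoted above can be turned into a few lines of Python. The quote only gives five reference ratios, so how to treat everything in between is an assumption of this sketch: it picks the reference ratio that is closest on a logarithmic scale. The function name classify is made up, and none of this is part of the twitterblacklist API.

```python
import math

# Reference points from the quote, as following:followers ratios.
LABELS = [
    (1 / 5, "twittercaster"),
    (1 / 2, "notable"),
    (1 / 1, "socially healthy"),
    (2 / 1, "newbie or social climber"),
    (5 / 1, "twitter spammer"),
]


def classify(following, followers):
    """Return the label whose reference ratio is closest on a log scale."""
    if following <= 0 or followers <= 0:
        # Assumption: the ratio is only meaningful for positive counts.
        raise ValueError("both counts must be positive for this sketch")
    ratio = following / followers
    closest = min(LABELS, key=lambda item: abs(math.log(ratio / item[0])))
    return closest[1]


if __name__ == "__main__":
    print(classify(5000, 1000))
    print(classify(150, 150))
```

The log scale is chosen so that 1:5 and 5:1 sit equally far from 1:1; a plain difference of ratios would squeeze all the "good" ratios together near zero.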

I wrote a tiny Greasemonkey script which looks up the username of the visited twitter profile and displays a nice warning message at the top of the page if it's listed at the twitter blacklist.

Blocking the user then, is just one click away 8-).

I made the script available at the userscript website, you can fetch it here.

I hope this also finds its way into some of the available twitter clients. If the twitter blacklist grows (which it does almost daily) it will make twitter an even nicer place to stay.

And last but not least: If you know other twitter spammers which aren't listed at the twitter blacklist yet, remember to report them (details about how you can report a spammer can be found at http://twitterblacklist.com).

Read or add comments to this article

April 24, 2008 08:01 PM :: Germany  

Luis Medinas

And i wonder....

Dear Gnome users... what new kind of application do you think GNOME needs on the desktop ?

I would like to see (maybe I'll start some new project) a GNU Octave frontend or some kind of IDE for scipy/numpy using the excellent reinteract

April 24, 2008 05:41 PM

Nirbheek Chauhan

Google Summer of Code, Gentoo

Right after the GSoC results were announced, Anant Narayanan sent an email to the gentoo-soc ML welcoming the students with lots of good advice about how to proceed, what all they can expect, and what all they're expected to do. Thanks Anant!

The only thing about that email that irked me was that third party source code management systems such as code.google.com, sf.net, and repo.or.cz were recommended for hosting the source code. Now, for a small project that does not have much in the name of Infra, this would be acceptable, but for a full-fledged organisation with a dedicated infra team, this looks quite shoddy (this probably happened due to insufficient communication between gentoo-soc and gentoo-infra). And on top of that, projects getting distributed across several repositories makes it impossible to find the code during and after SoC is finished. For instance, I am completely unable to find the code for a lot of the SoC 2007 projects.

Now, I understand that Gentoo Infra is very short-staffed and overworked at the moment, and hosting dedicated Trac setups for all the students is not an easy task. So I poked my mentor Patrick Lauer and asked him if he could host Redmine at gentooexperimental which could then be used as a central place for tracking/hosting all the Gentoo SoC projects. He agreed, but his dislike of Rails meant that I would have to do the setup and manage it.

And so it was done, and an email sent to the list. soc.gentooexperimental.org now hosts Redmine for project management.

After a small chat with Donnie Berkholz on IRC, we agreed that hosting the source code under Gentoo Infra and using Redmine for the rest of the stuff would be best. OTOH, Alec Warner was in favour of giving the students full freedom with hosting their projects as long as the place of their choice was usable. I replied to his email suggesting that in the interest of keeping the projects accessible from one place, people who want to do their development somewhere else be asked to create a dummy project at soc.ge.o which points to the place where the actual development is taking place.

Let's see how things turn out.

April 24, 2008 11:39 AM :: Uttar Pradesh, India  

April 23, 2008

Alex Bogak

Oh Gentoo, what had become of thee?

Dear friends

Yesterday was an important day for me. I stumbled into a very important issue, albeit small, which made me come to the following decision: I am leaving Gentoo as a desktop platform.

It does not come as an easy decision. I've been using Gentoo and quasi-actively participating in the community for about 5 years. I have it installed currently on 3 out of 4 computers I have (the last one being mac mini, which I keep with Mac OS X). So why would I take this decision?

It all began with a one simple thing. You may have read my previous posts on various WINE installations, and I use some Windows applications with WINE. But recently Internet Explorer stopped working. I've tried to reinstall it (and it is easy in Gentoo, just as in any other Linux distribution with decent package manager), but to no avail.

The next step was slightly more complicated, but still quite simple: I used VMware to install a complete Windows XP environment. It worked fine for a while, until I couldn't use VM images between the different computers I have. It just stopped working. Besides that, the performance of VMware on my AMD Athlon 1.8 with 1G of memory was, to say the least, appalling. Next came Innotek (now Sun) VirtualBox. This is the best emulation environment I could find to work on my computer. It works fine, and I use it for all my Windows-related projects.

But as a side effect of all these installations, the system began breaking. I started noticing various weird things, such as applications suddenly freezing at times, etc. A couple of days ago, when there were no applications running, I saw CPU usage at ~80%, so I did what most Windows users do: I rebooted the machine.

And then, the system just broke. System utilities seemed nowhere to be found. Some init scripts seemed to be incorrect, etc. I somehow fixed the situation by copying old versions from other projects, and updating the system. But now, GNOME has problems with graphics and themes, and most applets do not work or do not even exist. It just never ends, does it?

So, as a normal user of Gentoo, I went to emerge my world. I haven't done that for a couple of months, so there were almost 1G of updates waiting for me. I've downloaded all the packages, and began the emerge.

The last straw was a simple apache update. The system update failed because I had an old version. Not because the compile didn't work. Just because it needed me to manually do something!! It redirected me to a Gentoo doc site, which has 2 lines of code that fixed the problem, and emerge now runs again.

Why in heaven's name wasn't this done automatically? Why did I lose half a day, during which my system could have been updated? I lost this time because the update procedure stopped. I had to fix the Apache configuration, so my GNOME desktop could continue updating. I understand that this specific issue with Apache may be serious, and that not many ordinary people run it on their computer, but it still bugs me. I don't like it when I have to do this sort of manual intervention in an update procedure.

So what is the problem here? Daniel Robbins created a Gentoo motto once: The goal of Gentoo is to design tools and systems that allow a user to do his or her work as pleasantly and efficiently as possible, as they see fit....If the tool forces the user to do things a particular way, then the tool is working against, rather than for, the user. (cited from Gentoo Philosophy)

The problem is that I spent too much time caring for the computer with Gentoo. I don't have that luxury anymore. There was a time when geeking with the machine and fixing problems was cool. Today, it's a burden. I value time, and I only have 24 hours a day of it.

I believe that this may be one of the general problems with Gentoo. When it began, most folks using Linux were techies, who cared about all the bits on their computers. Gentoo fit very well in this community, so it flourished and became very popular. It provided tools that no one else had (and they used to compile everything manually anyway), and a community of good will and a lot of friendship. It had the best documentation among its brothers (and maybe still does), and the best team of engineers.

But nowadays, many users want a word processor, web browser, email program and video player. They want it now, and don't want to wait 20 minutes for a compilation to finish. They don't care about technicalities. And as Gentoo hasn't changed its nature, it doesn't fit the majority anymore. Sabayon anyone?

Gentoo distro has proven over the years, that it will stay the way it is. And that's why it won't be back on my desktop soon.

So, Gentoo, stay on server.

Ubuntu, CentOS - my desktop is waiting.

April 23, 2008 02:19 PM :: Israel  

April 22, 2008

Clete Blackwell

On the Ignorance of People

The USA’s 2008 election is the first governmental election of any kind that I will have an opportunity to vote in. This is a very important election for everyone. We face major decisions about the war in Iraq and on many major issues. A lot is at stake in this year’s general election. I have followed this election more than any other. I have been involved in a lot of conversations with people who come from very different cultures and backgrounds. Many of these people support my viewpoints, many of them disagree, and many of them could not care less about politics.

I have been appalled at how little it takes for a certain candidate to obtain an undecided and unopinionated person’s vote. Those of us who are strong conservatives or liberals will not be swayed to vote the opposite way for anything in the world (although this is debatable). However, undecided people are swayed way too easily. I have heard at least ten people tell me that they will vote for Obama because he is “young” and relates to “the younger generation” more than any of the “old” people such as Clinton and McCain. I have had one person tell me that, although they are a liberal, they would not vote for McCain if they were conservative because he is “old” and that he will probably “die in office.”

People disregard the candidate’s actual views and opinions and focus in on the unimportant things. They decide to vote for candidates because they “seem like nice people.” I am not sure if it is the college students’ way or if the entire population is this way, but it’s atrocious. If I wanted to, I could run for President, take all of the “correct” views on all of the issues; that is, the ones that would get me elected. I could completely disagree with all of these views, but I could stand up there, smile, act friendly, cry when I hear a sad story, etc. I could be a complete phony and get some of these idiots to follow me just because I am a down-to-earth and likable character.

What has our country come to? Elect someone because they are honest and they agree with your issues. Elect them because they will lead our country in the right (or should I say correct?) direction. But do NOT elect them because they are nice people and they seem friendly. This is ridiculous. Does anyone want someone in office who is nice but leads our country to demise? I don’t think so.

Amendment: It seems that my post has sparked a large debate over at the Gentoo Forums. Check it out here. It looks like most people agree with my basic argument that people are ignorant. However, many people seem to have some insane (read: extreme socialistic / communistic) ideas on the Gentoo Forums.

April 22, 2008 10:39 PM

Bryan Østergard

Good news, everyone!

I just had to use that phrase :)

So.. This news isn't actually for everybody - I suspect it'll only be interesting for a small part of my readers. And it's not so much news as recounting some experiences I've had lately with Gentoo Forums and some small bits of advice that I can offer based on that.

A few weeks ago some user claimed in a largish forums thread that genone (Marius Mauch) had absolutely no idea how portage worked and that he should let other people who did take over his job as portage developer. This is fairly silly as Marius has been working on portage for at least 4 years now, and he in turn asked who should take over his job. A well-known paludis user suggested that the paludis developers should take over and this post was quickly reported as trolling.

I saw this as clearly being a joke (nobody knowing the portage and paludis teams would ever seriously suggest the paludis developers should work on portage or vice versa) and stated so on the complaint thread. That was my first clear offence on gentoo forums and I got a warning for replying to the complaint thread.

It should probably be noted that I help run a fairly big irc chat network called freenode and that we actively encourage users to help explain simple misunderstandings and smooth out/resolve possible conflicts. Freenode calls this catalysing and I thought I was doing everybody a favour by explaining how it was a joke.

My second offence occurred today when I abused the 'report' function. Some user had suggested that the report function be changed in such a way that not just the post being reported but every post by the user would be marked as reported. This looked to me like a seriously bad idea that would make it much easier to harass other users - 'report' a random post (even if 2 years old) and every post by that user would now look like it had been reported. In fact, I thought this idea was bad enough that it needed the forums admins'/moderators' attention so they could reject it as suggested and hopefully have a discussion about the purpose of that idea - and maybe implement a different measure that would fit the purpose without leaving such potential for abuse.

And as one forums admin/moderator told me later: rules are rules. And as this was my second offence I got banned from forums.

Now, I'm perfectly happy that they banned me. As a forums user I was a guest in their house and have to respect their rules. And even though I can't say I understand how my behaviour has been bad, it's clearly unwanted and I fully support the forums admins' hard work to get rid of troublemakers. I even went as far as requesting to be banned after my first offence as I was convinced that I'd behave badly again in my folly, not understanding the rules.

To sum things up: I completely support my ban as I've somehow behaved badly on several occasions and I'm quite happy to see that no favours were done just because of my status as a retired gentoo developer or similar. Keep up the good work and ban people that keep breaking the rules as I did.

And finally a few bits of advice as promised in the beginning.
1. Never, ever try to explain misunderstandings when a post is reported. If you really need to explain how it was all just a misunderstanding, create a completely new thread with no link at all to the post containing the misunderstanding. It's much less confusing for forum admins if you create a new thread rather than replying to the thread in question (I was told this by one of the forum admins/moderators, so I trust this advice to be accurate and good).
2. Always consult a forum admin/moderator off the forums before using the report function. This way you know for sure whether the situation calls for using that feature or whether doing so would be an act of abuse.

If you follow these two simple pieces of advice, I hope you'll be able to stay out of trouble and enjoy using the Gentoo forums for years to come.

April 22, 2008 09:54 PM

Nirbheek Chauhan

Wheeeeeeeeeee

So today was the day.
An insane night, on an insane channel.

So we were promised Cake.
Which got a bit delayed,
but in the end we got a plate
Which was truly worth the wait

Translation: I've gotten accepted into GSoC, and the community bonding period has begun!

A couple of people I know got accepted as well -- Satya, Ramnik, and Siddarth. This will be a fun summer *grin*.

I was going through the abstracts of the accepted applications in orgs that interest me, and I found the following to be *very* interesting (in no specific order):

April 22, 2008 01:30 AM :: Uttar Pradesh, India  

April 21, 2008

Jason Jones

Sun shine

Well, today has been ok. I've spent most of it playing with my brothers, but later on in the night I came onto the computer to find something I never expected! I found that my new friend CMT_Music_Awards had said things I'll never forget, things that made me cry.

I was surprised when I read what she said. I think she said she's only 11, and she seems to be such a strong girl too! I ain't too much older than her now, and I'm nowhere near that strong.
She may be the answer to my prayers. For months I've prayed that someone would come along who could and would help me. And then, poof, I read a blog that I had to comment on. I wanted to help her so bad, and now it seems like I got a new friend!

I hope that EVERYTHING she's ever wanted or will ever want will come to her. I wish her all the happiness in the world, and more!

Thank you CMT. Thank you.

Have a great night, and you too, keep tomorrow in mind. I sure will :-)
<3

April 21, 2008 09:29 PM :: Utah, USA  

Martin Matusiak

clocking jruby1.1

Did you hear the exciting news? JRuby 1.1 is out! For real, you can call your grandma with the great news. :party: Wow, that was quick.

Okay, so the big new thing in JRuby is a bytecode compiler. As you may know, up to 1.0 it was just a Ruby interpreter in Java. Now you can actually compile Ruby modules to Java classes and no one will know the difference, very devious. :cool: Sounds like Robin Hood in a way, doesn’t it?

The JRuby guys are claiming that this makes JRuby on par with “regular Ruby” on performance, if not better. Hmm. Just to be on the safe side, what size shoes do you wear? Oh ouch, those are going to be tricky to fit in your mouth. :/ And Freud will say you’re stuck in the oral stage. Too much? Okay.

So here is my completely unvetted, dirty, real world test. No laboratory conditions here, you’re in the ghetto. First we need something *to* test. I don’t have a great deal of Ruby code at my disposal, but this should do the trick. How does scanning the raw filesystem for urls sound? The old harvest script actually does a half decent job of turning up a bunch of findings.
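The harvest script itself isn't shown here, so what follows is only a rough sketch of the general idea, not the actual code: read a byte stream line by line and pull out anything that looks like a url. The method name `extract_urls` and the deliberately loose pattern are my own assumptions for illustration.

```ruby
# Hypothetical stand-in for harvest.rb, whose source is not shown in this post.
# Scans any IO-like object line by line and collects url-shaped strings.
def extract_urls(io)
  # Loose pattern (an assumption): a scheme, "://", then a run of characters
  # that are commonly allowed in urls.
  url_pattern = %r{(?:https?|ftp)://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+}
  urls = []
  io.each_line do |line|
    # Work on a binary copy of the line so raw, non-text bytes from a
    # filesystem image do not raise encoding errors during matching.
    urls.concat(line.b.scan(url_pattern))
  end
  urls
end
```

To use it the way the benchmark does, you would feed it standard input, for example `extract_urls($stdin).each { |url| puts url }`. A raw partition can go a long way without a newline, so a real script would probably want to read fixed-size chunks instead of lines.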

Now introducing the contenders. First up, his name is JRuby, you know him from occasional mentions on obscure blogs and the programming reddit past the top 500 entries. He promises to free all Java slaves by giving away free Rubies to everyone!

Aaand the incumbent, the famous… Ruby! You know him, your parents know him, every family would adopt him as their own child if they could. He’s the destroyer of kingdoms and the creator of empires, he’s bigger than Moses himself!

Our two drivers will be racing across hostile territory. Your track is a 25 GB ext3 live file system. During this time, I can promise you that only Firefox is likely to be writing new urls to disk, but I could be lying eheheh. Due to the unpredictable nature of this rally track, regulations allow only one racer at a time, but you will be clocked.

First up is the new kid on the block Jay….Ruby. The Ruby code will not be compiled before execution, we’ll let the just-in-time compiler do its thing.

$ time ( sudo cat /dev/sda5 | bin/jruby harvest.rb --url > /tmp/fsurls.jruby )
real 39m26.547s
user 37m19.072s
sys 1m28.406s

Not too shabby for a first run, but since this is a brand new venue, we have no frame of reference yet. Let’s see how Ruby will do here.

$ time ( sudo cat /dev/sda5 | harvest.rb --url > /tmp/fsurls.ruby )
real 78m42.186s
user 62m12.537s
sys 2m18.721s
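
If you want a number for the gap, the two "real" figures above can be turned into seconds and divided. This is just a small sketch; the helper name `time_to_seconds` is made up for the example, and it only understands the `NmS.SSSs` format that `time` printed here.

```ruby
# Convert a bash `time` stamp such as "39m26.547s" into seconds.
# (Hypothetical helper; only handles the minutes-and-seconds format above.)
def time_to_seconds(stamp)
  m = stamp.match(/\A(\d+)m(\d+(?:\.\d+)?)s\z/)
  raise ArgumentError, "unrecognised time: #{stamp}" unless m
  m[1].to_i * 60 + m[2].to_f
end

jruby_real = time_to_seconds("39m26.547s")
ruby_real  = time_to_seconds("78m42.186s")

# Ratio of wall-clock times for the two runs above.
puts format("Ruby took %.2f times as long as JRuby", ruby_real / jruby_real)
```

That only compares wall-clock time for these two single runs, nothing more.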

Well, look at that! The new kid is pretty slick, isn’t he? Sure is giving the old man a run for his money. Let’s see how they answered the questions.

$ lh
-rw-r--r-- 1 alex alex 86M 2008-04-21 18:29 fsurls.jruby
-rw-r--r-- 1 alex alex 8.6G 2008-04-21 20:58 fsurls.ruby

Yowza! No less than a hundred times more matches with Ruby. What is going on here? Did Jay just race to the finish li