Luca's meaningless thoughts  

Surreal empanada

by Leandro Lucarella on 2009-12-30 20:55 (updated on 2009-12-30 20:55)
tagged es, humor, rant, surreal - with 3 comment(s)

I just called a place to order empanadas. The order included 14 meat empanadas, but when I told the guy, he replied, horrified:

Nooooooooooo, man! I can't sell you that many empanadas. 12 at most.

When I asked why not, he said that otherwise he wouldn't be able to fill the other orders. Come on, dude... First, what's the problem with not being able to fill the other orders if you've already sold everything? Either you make more empanadas and earn more money, or you stay as you are and earn the same, but in less time and fewer orders. And as if that weren't enough: it's a difference of 2 fucking empanadas!!! It's not like I asked for 5 dozen...

There was no way around it: they wouldn't sell me the 14, so I had to order 12 meat ones and two more of another kind (never going over 12, of course!).

Anyway, brilliant work by "El Popeye" in Coghlan.

Goodbye Mazziblog

by Leandro Lucarella on 2009-12-23 21:36 (updated on 2009-12-23 21:36)
tagged blog, es, mazziblog, self - with 3 comment(s)

The hosting for Mazziblog expires in a few days and we decided not to renew it, so it will soon be a broken link :(.

The truth is I'm quite sad about it, but we are putting our energy elsewhere (me here, Mazzi on his Tumblr and his Flickr) and it just isn't sustainable. Still, I think Mazziblog was a great project, and I want to publicly thank my personal friend who chased me until I got on board.

Thanks Mazzi! And thanks to everyone who passed briefly through the blog too (Jonas, Jalo and Alby's :).

I'm making a backup now, and maybe in the future I can put the blog up somewhere, even if only for historical purposes (in any case, there is always our friend the WaybackMachine to remind us as best it can).

Well, a hard blow to start a new year, but the memory of Mazziblog will always be a good one, and I think it fulfilled its goal, considering that the very first post predicted a statistically black future:

They also say that 40% don't survive the first month, and 20% the first year. Let's hope we don't end up in that part of the statistics.

It was almost 5 years of pure success =P

Goodbye Mazziblog!!! May the force be with you...

D.NET is looking for developers

by Leandro Lucarella on 2009-12-20 21:39 (updated on 2009-12-20 21:44)
tagged .net, d, d.net, dnet, en, mono - with 0 comment(s)

The D.NET project is looking for developers. Here is a small quote from the latest e-mail from Tim Matthews:

It is in a very alpha like state and this is just a callout for developers to work on this compiler. Not for anyone intending to immediately target the CLR with D.

D.NET targets D2 only for now and can only access the .NET standard library (you can't use Phobos).

Update

It looks like Tim Matthews is now hosting the project here.

Grog XD: an epic fail makes it into the latest Monkey Island game

by Leandro Lucarella on 2009-12-15 12:56 (updated on 2009-12-15 12:56)
tagged en, fail, game, grog xd, humor, monkey island, video - with 0 comment(s)

I don't know if you heard about this huge epic fail by one of the most fascist TV news channels in Argentina (CN5); if you didn't, first take a look at this video:

Here is a video with the original description of Grog in The Secret of Monkey Island:

Well, it turns out the new beverage Grog XD was included in the new Monkey Island game: Tales of Monkey Island. Very funny indeed. XD [*].

https://www.llucax.com.ar:443/blog/posts/2009/12/15-grog1.mini.jpg https://www.llucax.com.ar:443/blog/posts/2009/12/15-grog2.mini.jpg https://www.llucax.com.ar:443/blog/posts/2009/12/15-grog3.mini.jpg

Via Noticias de Ayer.

[*]This is a smiley, not a literal X followed by a D (just in case the CN5 people are reading...)

Monty Python: Almost the Truth

by Leandro Lucarella on 2009-12-13 17:04 (updated on 2009-12-13 17:04)
tagged almost the truth, es, isat, monty python, tv - with 0 comment(s)

ISAT did it again. Starting today, it will be airing a documentary series about Monty Python: Almost the Truth.

I won't bother writing my own comments about it, because someone already did, so you can go read them there directly. I'll just quote one line that struck me as very true:

Someone once said that Monty Python were to humor what The Beatles were to music.

In-Edit: Keep On Running: 50 Years Of Island Records

by Leandro Lucarella on 2009-12-13 15:37 (updated on 2009-12-13 15:37)
tagged documental, es, in-edit, island records, keep on running, movie, music - with 0 comment(s)

This documentary is a must-see. I honestly had no idea how great the label is (well, was, I would almost say) and how fundamental it has been to the history of music.

Without a doubt, today's musical landscape wouldn't be even remotely similar to what it is had the label not existed. Did you know, for example, that this label was the one that spread Reggae and Ska? That happened because Chris Blackwell, its founder, spent his childhood in Jamaica soaking up its music, and had the vision and long-term goal of spreading it, from the label's beginnings in the early 60s to the discovery of a Jamaican kid named Roberto, whom he guided and mentored until the genre exploded in the mid 70s.

A constant at Island Records (and for Chris) seems to have been long-term thinking and the freedom and flexibility given to the artists (preserving their artistic integrity).

I remember, for example, that the U2 records I used to buy (it was one of the first bands whose records I bought) had that distinctive label logo. In my opinion U2 went down the drain a long time ago and has little to no artistic integrity left, and now, putting two and two together, I see that the band's last creative album (in this humble servant's opinion), Pop, was also the last record they released with Island. Coincidence?

Anyway, this is just 1% of what the film shows. Unfortunately, there are no screenings left, but if you get a chance to see it, don't hesitate for a second. The bad news is you'll realize how little you know about the history of music; the good news is you'll know a little more ;)

In-Edit: Madness: The Liberty Of Norton Folgate

by Leandro Lucarella on 2009-12-13 15:16 (updated on 2009-12-13 15:16)
tagged es, in-edit, live, madness, movie, music, the liberty of norton folgate - with 0 comment(s)

For some reason (probably that I didn't know the name of their latest album :) I had gotten the idea that it would be a documentary about their golden age, but it was more of a live show of their latest work. It was good anyway: the show took place in an old theater, with a lot of production, a circus-like air, lots of joy, people in costumes and even a few theatrical moments.

With a lot of social content in the lyrics and an almost mystical atmosphere, they evoked past times (around 1900) in the Norton Folgate area.

Nothing earth-shattering, but it shows the quality of an essential band giving an excellent show, and it's great entertainment for a while (if you like the band; otherwise it will probably bore you a bit).

In-Edit: Peligrosos Gorriones

by Leandro Lucarella on 2009-12-13 15:07 (updated on 2009-12-13 17:06)
tagged el teatro, es, in-edit, live, music, peligrosos gorriones - with 0 comment(s)

The show was announced for 11 PM but started at almost 1 AM. That meant two hours of considerable cursing, given that it was a Thursday, but once the show started nothing else mattered. What a band, my god!!! It was the first time I had seen them live because, unfortunately, I didn't start listening to them seriously until after their breakup in 1999.

Even though the genius of this band was already very clear to me, my admiration took on a new dimension thanks to this live show. The quality of the musicians, the way they have fun on stage, and the power and energy they transmit are things you can't perceive on a record. I had never seen Bochatón (the only ex-Gorrión I had gone to see several times before) so happy and playful; I don't know if he had smoked something or if he was always like that with the Gorriones, but as a solo artist, while he's not at all a sourpuss, he doesn't transmit even half of what he transmitted that night.

The set list? Very complete. Almost an hour and a half of show (considering their 3 albums together add up to two hours and ten minutes, you can get an idea of the coverage :). I don't remember the order of the songs, but I think I more or less remember all the songs they played, so here is a list, album by album (errors and omissions excepted):

Peligrosos Gorriones (1993)
  • Escafandra (almost certainly the opening song, the second at most)
  • Trampa
  • Tesoro
  • El bicho reactor (closing song)
  • Rayo de amor
  • Panza de araña
  • Siempre acampa
  • Un ardiente beso
  • La mordida
  • Nuestros días (I almost shed a tear, what a beautiful song :)
  • Estos pies
  • Honda congoja y pesar
  • Cachavacha

The missing ones: Cacería de caballos, Juegue ud. and Adentro.

Fuga (1995)
  • El mimo
  • Continuo susto
  • La procesión (I was really surprised by how he did the echo effect on his voice without any kind of device, simply... well, with his voice)
  • Serpentina
  • Manicomio gris
  • Agua acróbata
  • Sé que el tiempo
  • Sacacorcho
  • Penumbra
  • Mañanitas
  • Amo el jardín

The missing ones: Las voces del viento, Después de todo and Baila Valses.

Antiflash (1997)

The most mutilated one:

  • Mi propio brujo
  • Desde que te fuiste
  • Corre
  • Por tres monedas

The missing ones: Me extingo, El sol de jaf (I just realized I think they didn't play these two!, unless it has been erased from my memory :S), Macanas, Villancicos, Blanda y plácida, Viento castelar, Muchachita, Proyector de cine, Salvaje, Una dosis and Jugar con armas.

I surely got something wrong, so any corrections you may have are welcome.

Update

Here is an article about the band's reunion, which leaves the door open to the possibility of a new album. Enjoy!

Chronicles of In-Edit 2009

by Leandro Lucarella on 2009-12-13 15:06 (updated on 2009-12-13 15:06)
tagged buenos aires, documental, es, festival, in-edit, movie, music - with 0 comment(s)

As I have already mentioned, from Thursday the 10th to Monday the 14th, the 2nd Buenos Aires Documentary and Music Film Festival (AKA In-Edit Cinzano) is taking place.

I was able to go to the opening party on Thursday, where the ex-ex-Peligrosos Gorriones played, and while I couldn't see all the movies I had planned to, I did manage to see several of them. In the next few posts I will write about the things I saw. Stay tuned!

2nd Buenos Aires Documentary and Music Film Festival

by Leandro Lucarella on 2009-12-08 13:28 (updated on 2009-12-08 16:08)
tagged buenos aires, documental, es, festival, in-edit, movie, music - with 0 comment(s)

From Thursday the 10th (the day after tomorrow) to Monday the 14th, the 2nd Buenos Aires Documentary and Music Film Festival (AKA In-Edit Cinzano) will take place. The program looks interesting, although I can only vouch for Style Wars, a documentary about the history of graffiti in New York (which I saw with my personal friend mazzi just last weekend, without knowing it was going to be screened here). It's crazy how young the people who started it all were; in the documentary, most of the people must be between 12 and 18 years old.

Ah! Something to keep in mind: tickets for international movies cost $12 and the national ones are free; you only have to pay a 10% tax (that is, $1.2). I won't give my opinion on charging a tax to promote national cinema that already gets very little space, even when the organizers don't charge admission...

There is also an opening party on Thursday where the ex-ex-Peligrosos Gorriones will play (for the third time this year, if I'm not mistaken).

An interesting plan while we wait for Santa Claus =P

Update

I did a bit of research, so here is a list of the movies I find most interesting:

  • Who Killed Nancy:

    The Sid Vicious - Nancy Spungen case is reopened. The film argues that the late Sex Pistols bassist was a lot of ugly things, but not his girlfriend's murderer. A movie that is as much a punk-rock chronicle as it is an addictive conspiracy documentary.

  • R.E.M.: This Is Not A Show:

    July 2007, five nights locked inside the Olympia Theatre in Dublin. An audience of family, friends and close fans. Very new or very old songs. A week of rehearsals with an unprecedented R.E.M., filmed with great beauty in almost palpable black and white.

  • Madness: The Liberty Of Norton Folgate:

    A 2008 show that is a love song to London. Norton Folgate (an area of the city that was independent until 1900) served as a springboard for the ska-pop wizards to build a story full of rogues, Victorian scoundrels and hustlers. Circus, theater, pop, sublime and up close. The lively camera of Julien Temple.

  • Bananaz:

    Damon Albarn (of Blur) and cartoonist Jamie Hewlett (Tank Girl) joined forces to create Gorillaz.

    [...]

    Bananaz makes it possible [...] to see what is hidden behind it all: the recording work -with De La Soul, Ibrahim Ferrer, a Shaun Ryder unable to remember two words- the live shows, the animated staging with the band behind the screen, the Hewlett-Albarn arguments, the clowning around, the tours and the problems. If these are the Monkees of the 2.0 era, we want more. Awesome.

The Argentine movies leave quite a bit to be desired; there are only 2 interesting ones:

  • Kapanga Todoterreno:

    El Mono, Maikel, Balde, Maffia, Mariano and Memo are six construction workers who decide to enter a beat-band contest to fulfill their dream: devoting themselves to music and asado.

    On their way to the contest, they get caught up in the most hilarious adventures: fighting alongside Conan against a giant dragon; facing the Barbecue Demon in an absurd exorcism; saving Araceli from the hands of terrorists in an adventure in the best James Bond style; and they even get trapped in the labyrinth of the sewers, from which they can only escape by answering the riddles of the fabulous Chorizo Cantor.

    Whether you like Kapanga or not, it's made by Farsa Producciones, which probably makes it worth watching just for how delirious and bizarre it is :)

XKCD and Michael and Me

by Leandro Lucarella on 2009-12-04 21:07 (updated on 2009-12-04 21:07)
tagged es, michael moore, movie, política, roger & me, video, xkcd, youtube - with 0 comment(s)

The latest XKCD joke references Michael Moore's first movie (Roger & Me), which made me remember it and think: why not recommend it?

Huh? Why not? I do recommend it.

Then we'll have to see his new one, which is about the 2007-2009 financial crisis. If it's half as good as this video, it's already worth it ;)

LDC uploaded to Debian

by Leandro Lucarella on 2009-12-03 16:57 (updated on 2009-12-03 16:57)
tagged d, debian, en, ldc - with 0 comment(s)

Finally, Debian's bug #508070 is closed! That means that LDC is officially in Debian now. The package is only in the experimental repositories for now, I hope it hits testing soon.

Thanks to Arthur Loiret for the packaging efforts!

bpython

by Leandro Lucarella on 2009-12-03 11:56 (updated on 2009-12-03 11:56)
tagged bpython, curses, en, floss, interpreter, python, software - with 0 comment(s)

I'll just copy what the home page says:

bpython is a fancy interface to the Python interpreter for Unix-like operating systems (I hear it works fine on OS X). It is released under the MIT License. It has the following features:

  • In-line syntax highlighting.
  • Readline-like autocomplete with suggestions displayed as you type.
  • Expected parameter list for any Python function.
  • "Rewind" function to pop the last line of code from memory and re-evaluate.
  • Send the code you've entered off to a pastebin.
  • Save the code you've entered to a file.
  • Auto-indentation.
https://www.llucax.com.ar:443/blog/posts/2009/12/03-bpython.png

Grooveshark

by Leandro Lucarella on 2009-12-02 23:40 (updated on 2009-12-02 23:40)
tagged en, flash, grooveshark, music, online, streaming - with 0 comment(s)

Grooveshark is a nice web 2.0 site that lets you listen to music online. The difference from other similar sites is that you can search for an album, artist or song, and listen to what you find in full (full songs, full albums), and they have a pretty large collection.

Unfortunately Flash is not dead yet, so you need that crappy, smelly plug-in to access the site.

Improved string imports

by Leandro Lucarella on 2009-12-01 19:38 (updated on 2009-12-01 19:38)
tagged d, en, import, patch, string import - with 0 comment(s)

D has a very nice capability called string imports. A string import lets you read a file at compile time as a string, for example:

pragma(msg, import("hello.txt"));

This will print the contents of the file hello.txt when it's compiled, or it will fail to compile if hello.txt is not readable or the -J option is not used. The -J option is needed for security reasons; otherwise compiling a program could end up reading any file in your filesystem (storing it in the binary and possibly violating your privacy). For example, you could compile a program as root and run it as an unprivileged user, thinking it can't possibly read some protected data, but that data could have been read at compile time, with root privileges.

Anyway, D asks you to use the -J option if you are doing string imports, which seems reasonable. What doesn't look so reasonable is that string imports can't access a file in a subdirectory. Let's say we have a file test.d in the current directory like this:

immutable s = import("data/hello.txt");

And in the current directory we have a subdirectory called data with a file hello.txt in it. This won't compile, ever (no matter what -J option you use). I think this is an unnecessary limitation; using -J. should work. I can see why it was done like this: what if you write:

immutable s = import("../hello.txt");

It looks like this shouldn't work, so we could ban .. from string imports, but then what about this:

immutable s = import("data/../data/hello.txt");

This should work; it's a little convoluted, but it should work. And what about symbolic links?

Well, I think this limitation can be relaxed (other people think so too, there is even a bug report for this), at least on POSIX-compatible OSs, because we can use the realpath() function to resolve the file. If you resolve both the -J directories and the resulting files, it's very easy to check whether the string-imported file really belongs to a -J subdirectory or not.

Since this looked very trivial to implement, I gave it a shot and posted a patch and a couple of test cases to that very same bug report :)

The patch is incomplete, though, because it's only tested on Linux and it lacks Windows support (I don't know how to do this on Windows and don't have an environment to test it). If you like this feature and you know Windows, please complete the patch so it has better chances of making it into D2; you only have to implement the canonicalName() function. If you are on another supported POSIX OS, please test the patch and report any problems.

Thanks!

Die Flash, die!!!

by Leandro Lucarella on 2009-12-01 19:10 (updated on 2009-12-01 19:10)
tagged en, flash, html, web - with 2 comment(s)

I hope HTML5 eats Adobe Flash alive and spits out its bones, because it sucks so hard it hurts.

Fortunately, it seems that Google is planning on using it, which is nice because things adopted by the big G usually live long and well, and tend to be adopted by a lot of people. For example, there is an experimental version of YouTube that doesn't use Flash, only HTML5. It only works with the WebKit rendering engine for now (I tested it with Midori and it worked, with a few quirks, but it worked :).

opDispatch

by Leandro Lucarella on 2009-11-30 02:02 (updated on 2009-11-30 02:02)
tagged d, dynamic, en, opdispatch, patch - with 0 comment(s)

From time to time, people have suggested features to make it easier to add some dynamic capabilities to D. One of the suggestions was adding a way to have dynamic members. This is especially useful for things like ORMs or RPCs, so you can write something like:

auto rpc = new RPC;
rpc.foo(5);

and have it automatically translated into some sort of SQL query or RPC call, using some kind of runtime introspection. To enable this, you can translate the former into something like:

obj.dispatch!("foo")(5);

There was even a patch for this feature, but Walter didn't pay much attention and ignored it until a couple of days ago, when he got bored and implemented it himself, in his own way =P

I think this is a very bad policy, because it discourages people from contributing code. There is not much difference between suggesting a feature and implementing it by providing a patch, unless you have a very good personal relationship with Walter. You will almost never get feedback on your patch; Walter prefers to implement things himself instead of giving you feedback. This makes it very hard for people who want to contribute to learn about the code and about how Walter wants patches to be done, and that is what discourages contributions.

I won't write again about the problems in the D development model; I have already done that without much success (except for Andrei, who is writing better commit messages now, thanks for that! =). I just wanted to point out another thing that Walter doesn't get about open-source projects.

Anyway, this post is about opDispatch(), the new way of doing dynamic dispatching. Walter proposed opDynamic(), which was a bad name, because it's not really dynamic, it's completely static; it just enables dynamic dispatching with a little extra work. Fortunately, Michel Fortin suggested opDispatch(), which is a better name.

The idea is simple: if a method m() is not found, a call to opDispatch!("m")() is tried. Since this is a template call, it's a compile-time feature, but you can easily do a dynamic lookup like this:

class RPC
{
    void opDispatch(string name)(int x)
    {
        // Forward the compile-time name to a runtime lookup.
        this.dispatch(name, x);
    }

    void dispatch(string name, int x)
    {
        // dynamic lookup
    }
}

I personally like this feature; we'll see how it all turns out.

Recyclable

by Leandro Lucarella on 2009-11-21 18:00 (updated on 2009-11-21 18:00)
tagged es, imán, photo, reciclable - with 2 comment(s)

More bizarre stuff from the real world:

https://www.llucax.com.ar:443/blog/posts/2009/11/21-reciclable.jpg

Can anyone explain to me what the magnet has to do with anything?

Update

It seems to be the symbol for Recyclable Steel (at least in England). More details in the comments...

Unintentional fall-through in D's switch statements

by Leandro Lucarella on 2009-11-20 23:34 (updated on 2009-11-20 23:34)
tagged d, en, fall-through, patch, switch - with 2 comment(s)

Removing fall-through from D's switch statement is something that has been discussed since the early days of D; there are discussions about it from 2001 up to this date [*]. If you don't know what I'm talking about, see this example:

switch (x) {
case A:
    i = x;
    // fall-through
case B:
    j = 2;
    break;
case C:
    i = x + 1;
    break;
}

If you read the case A code carefully, it doesn't include a break statement, so if x == A, not only will i = x be executed, the code in case B will be executed too. This is perfectly valid code, inherited from C, but it tends to be very error prone, and if you forget a break statement, the introduced bug can be very hard to track down.

Fall-through is fairly rare, and it would make perfect sense to make it explicit. Several suggestions have been made over time to make fall-through explicit, but nothing has materialized yet. Here are the most frequently suggested solutions:

  • Add a new syntax for non-fall-through switch statements, for example:

    switch (x) {
    case A {
        i = x;
    }
    case B {
        j = 2;
    }
    case C {
        i = x + 1;
    }
    }

  • Don't fall-through by default, use an explicit statement to ask for fall-through, for example:

    switch (x) {
    case A:
        i = x;
        goto case;
    case B:
        j = 2;
        break;
    case C:
        i = x + 1;
        break;
    }
    

    Others suggested continue switch or fallthrough, but I think some of these suggestions were made before goto case was implemented.

A few minutes ago, Chad Joan filed a bug about this issue, but with a patch attached 8-). He opted for an intermediate solution, more along the lines of a new switch syntax. He defines two case statements: case X: and case X!: (note the !). The former doesn't allow implicit fall-through and the latter does. This is the example in the bug report:

switch (i)
{
    case 1!: // Intent to use fall through behavior.
        x = 3;
    case 2!: // It's OK to decide to not actually fall through.
        x = 4;
        break;

    case 3,4,5:
        x = 5;
        break;

    case 6: // Error: You either forgot a break; or need to use !: instead of :
    case 7: // Fine, ends with goto case.
        goto case 1;

    case 8:
        break;
        x = 6; // Error: break; must be the last statement for case 8.
}

While I really think the best solution is to simply require a goto case if you want to fall through [†], it's great to have a patch for one solution. Thanks Chad! =)

[*]

This is the latest discussion about this, started by Chad Joan (I guess): http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=101110

Here is the last minute announcement of the patch: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=101937

And here are some links to older discussions related to the switch statement:

[†]I find it more readable and with better locality: to know whether something falls through or not, I just have to read the code sequentially, without remembering which kind of case I'm in. And I think cases without any statements should be allowed too; I wonder how this interacts with case range statements.

Gato Pajero

by Leandro Lucarella on 2009-11-20 16:07 (updated on 2009-11-20 16:07)
tagged ciencias naturales, es, fcnym, gato, gato pajero, la plata, museo, photo - with 0 comment(s)

https://www.llucax.com.ar:443/blog/posts/2009/11/20-gato-pajero.mini.jpg

Taken at the Museo de Ciencias Naturales of the Facultad de Ciencias Naturales y Museo of the Universidad Nacional de La Plata.

Joking aside, it's a very nice museum to visit...

Primer

by Leandro Lucarella on 2009-11-19 23:31 (updated on 2009-11-19 23:31)
tagged es, movie, primer, shane carruth, time travel, xkcd - with 0 comment(s)

A little while ago I watched Primer, a movie I found thanks to an XKCD comic. I liked it quite a lot; it's the kind of movie, like Pi, where you don't understand too much unless you watch it a couple of times or read the Wikipedia article =P.

As a curious fact, the movie was written, directed, produced, edited, starred in and scored by Shane Carruth, a mathematician, and it cost only U$D 7,000. Also, the actors all ended up being friends or relatives, because he said the actors who auditioned all put too much drama into their acting =P.

His idea was to show a scientific discovery (in this case, a form of time travel) in a realistic way; that is, happening by chance, as a side effect of another experiment. I won't say more so I don't spoil the movie too much (I assume that if you read XKCD you already figured there was time travel involved ;).

For those who have already seen it, the Wikipedia article has a nice diagram of how the time travel works, which I recommend looking at.

What a nice school!

by Leandro Lucarella on 2009-11-19 12:26 (updated on 2009-11-19 12:26)
tagged es, facultad, fiuba, gradiente, política, uba, video, youtube - with 0 comment(s)

On Friday, November 13, 2009, in the Directive Council of the FIUBA (the UBA's School of Engineering, in case you're too lazy to follow links), a little animal called Edgardo Romano, coordinator of the Job Placement and Internships Area of the SEUBE (Secretariat of University Extension and Student Welfare) and member of the El Gradiente political group, unexpectedly punched a student during the session to elect the Dean. Unfortunately (for him), it was caught on video.

Wash when dirty

by Leandro Lucarella on 2009-11-18 00:26 (updated on 2009-11-18 00:26)
tagged es, foto, humor - with 0 comment(s)

Brilliant suggestion from the Puma people.

https://www.llucax.com.ar:443/blog/posts/2009/11/18-lavar-cuando-esté-sucia.mini.jpg

In case it's not clear, that's the back of a t-shirt's care label.

La fábrica de fallas

by Leandro Lucarella on 2009-11-17 10:30 (updated on 2009-11-17 10:30)
tagged cc, copyleft, es, evento, la tribu, política - with 0 comment(s)

https://www.llucax.com.ar:443/blog/posts/2009/11/17-la-fábrica-de-fallas.jpg

On Saturday, November 21 and Sunday, November 22, the second free culture and copyleft festival, called Fábrica de Fallas, will take place at FM La Tribu (Lambaré 873).

The program is long and has many very interesting activities; I recommend taking a look at it. Unlike other events of this kind, it seems much more wide-ranging and interesting. Through this event I even learned that there is already an Argentine branch of the Pirate Party.

Anyway, a very interesting event, worth dropping by.

Annotations, properties and safety are coming to D

by Leandro Lucarella on 2009-11-13 23:34 (updated on 2009-11-14 15:45)
tagged annotation, d, en, property, safe, trusted - with 0 comment(s)

Two days ago, the documentation on functions was updated with some interesting revelations...

After a lot of discussion about properties needing improvements [*], it seems they are now officially implemented using annotations, which in turn seem to be official too now, after some more discussion [†].

Annotations will also be used for another long-discussed feature: a safe [‡] subset of the language. At first it was thought of as some kind of separate language, activated through a -safe flag (which is already in the DMD compiler but has no effect AFAIK), but after the discussion it seems that it will be part of the main language, with the ability to mark parts of the code as safe or trusted (unmarked functions are unsafe).

Please take a look at the discussions for the details but here are some examples:

Annotations

They are prefixed with @ and are similar to attributes (not class attributes; static, final, private, etc.; see the specs for details).

For example:

@ann1 {
    // things with annotation ann1
}

@ann2 something; // one thing with annotation ann2

@ann3:
    // From now on, everything has the ann3 annotation

For now, only the compiler can define annotations, but maybe in the future users will be able to define them too and access them through some kind of reflection. Time will tell; as usual, Walter is completely silent about this.

Properties

Properties are now marked with the @property annotation (maybe a shorter annotation would be better, like @prop?). Here is an example:

class Foo {
    @property int bar() { ... } // read-only
    @property { // read-write
        char baz() { ... }
        void baz(char x) { ... }
    }
}

Safe

Functions can now be marked with the @safe or @trusted annotations. Unmarked functions are unsafe. Safe functions can only use a subset of the language that is safe by some definition (memory-safe with no undefined behavior is probably the most accepted one). Here is a list of things a safe function can't do:

  • No casting from a pointer type to any type other than void*.
  • No casting from any non-pointer type to a pointer type.
  • No modification of pointer values.
  • Cannot access unions that have pointers or references overlapping with other types.
  • No calling of unsafe functions.
  • No catching of exceptions that are not derived from class Exception.
  • No inline assembler.
  • No explicit casting of mutable objects to immutable.
  • No explicit casting of immutable objects to mutable.
  • No explicit casting of thread local objects to shared.
  • No explicit casting of shared objects to thread local.
  • No taking the address of a local variable or function parameter.
  • Cannot access __gshared variables.

There is some discussion about bounds checking staying active in safe functions even when the -release compiler flag is used.

Trusted functions are not checked by the compiler but are trusted to be safe (they should be manually verified by the function's author), and they can use unsafe code and call unsafe functions.
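
A sketch of how this might look in code (illustrative only, based on the rules listed above; the function names are made up, and the @safe checks shown as comments are what the compiler would reject):

```d
import core.stdc.stdlib : malloc;

@safe int twice(int x) {
    // int* p = cast(int*) x;  // error: cast from non-pointer to pointer
    // int* q = &x;            // error: taking the address of a parameter
    return 2 * x;
}

// Not checked by the compiler; the author vouches for its memory safety.
@trusted int* oneInt() {
    auto p = cast(int*) malloc(int.sizeof);
    *p = 1;
    return p;
}

@safe int useThem() {
    auto p = oneInt();  // OK: @safe code may call @trusted functions
    return twice(*p);   // dereferencing is allowed, pointer arithmetic is not
}
```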

[*] Here are some links to the property discussions:

[†] Discussion about annotations:

[‡] Discussions about SafeD:

Fotopedia

by Leandro Lucarella on 2009-11-13 18:06 (updated on 2009-11-13 18:06)
tagged cc, en, fotopedia, photo - with 0 comment(s)

About Fotopedia:

Fotopedia is breathing new life into photos by building a photo encyclopedia that lets photographers and photo enthusiasts collaborate and enrich images to be useful for the whole world wide web.

It's like Wikipedia, but only for pictures. Pick a place, a person, an object, whatever, and get (Creative Commons licensed) pictures of it.

Burj Dubai

by Leandro Lucarella on 2009-11-13 16:02 (updated on 2009-11-13 16:02)
tagged building, burj dubai, en, flickr, joichi ito, photo, surreal - with 0 comment(s)

Is it a drawing? ... Is it a 3D render? ...

https://www.llucax.com.ar:443/blog/posts/2009/11/13-burj-dubai.jpg

It's a photograph! (by Joichi Ito)

The Burj Dubai is the tallest building in the world... by far (about 50% taller than the second tallest). It casts a shadow 2.5 km long.

Go nuts

by Leandro Lucarella on 2009-11-11 12:14 (updated on 2009-11-11 18:48)
tagged compiler, d, en, go, google, language, software - with 11 comment(s)

I guess everybody (at least everybody with some interest in system programming languages) should know by now about the existence of Go, the new system programming language released yesterday by Google.

I think this has a huge impact on D, because it's trying to fill the same hole: a modern high-performance language that doesn't suck (hello, C++!). They have a common goal too: being practical for business (both are designed to get things done, with ease of implementation in mind). But there are still very big differences between the two languages. Here is a small summary (from my subjective point of view, after reading some of the Go documentation):

Go vs. D:

  • Go feels more like a high-level, high-performance programming language than a real system programming language (no asm, no pointer arithmetic). D feels more like a real, close-to-the-metal system programming language.
  • Go is extremely simple, with just a very small set of core features. D is much more complex, but very powerful and featureful.
  • Go can call C code but can't be called from C code. D interacts very well with C in both directions (version 2 can partially interact with C++ too).
  • Go feels like a very well thought out, cohesive programming language. D feels like a bag of features that grew in the wild.
  • Go has a FLOSS reference implementation and looks very FLOSS friendly, with proper code review, a VCS, mailing lists, etc. D's reference implementation is not FLOSS, and the project is not very FLOSS friendly (it's just starting to open up a little, but it's a slow and hard process).
  • Go is backed by a huge corporation, so I expect a very large community in a very short time. D is supported by a very small group of volunteers and has a small community.

I really like the simplicity of Go, but I have my doubts about how limiting it could be in practice (it doesn't even have exceptions!). I have to try it to see if I will really miss the features of more complex programming languages (like templates / generics, exceptions, inheritance, etc.), or if it will just work.

I have the feeling that things will just work, and that what's missing in Go will not be a problem when doing actual work. Maybe it's because I had a very similar feeling about Python (indentation matters? Having to pass self explicitly to methods? No ++? No assignment in if, while, etc.? I hated all these things at first, but after understanding the rationale and using them in real work, it works great!). Or maybe it's because there are extremely capable people behind it, like Ken Thompson and Rob Pike (that's why you can see all sorts of references to Plan 9 in Go :), people who know about designing operating systems and languages, a good combination for designing a system programming language ;)

You never know with these things; Go could die in the dark or become a very popular programming language, only time will tell (but since Google is behind it, I guess the latter is more likely).

The Mighty Boosh

by Leandro Lucarella on 2009-11-07 17:09 (updated on 2009-11-07 17:09)
tagged es, humor, isat, serie, series, the mighty boosh, tv, uk - with 0 comment(s)

I've said several times already that I don't have a TV (or cable).

Why do I repeat it? Because every time I get the chance to sit in front of a TV with cable and spend half an hour zapping without finding anything interesting, I think the same thing: I'm not getting cable.

But of course, there's always an exception to the rule, and in this case that exception is called I-SAT (for some reason the JavaScript on the main page doesn't work for me, so digging through the HTML I found you can get in through this alternative link, which I leave here in case you can't get in either). On that blessed channel there's always a good chance of finding something interesting.

Last night was no exception, and I ran into a very delirious British series called The Mighty Boosh. It's actually a group of the same name that has done a bit of everything: theatre, radio and TV. 100% British humour, so you'll know what to expect (I personally find it very funny :-).

In case you can't tell from the photos, one of the members, Noel Fielding, is the actor who plays Richmond, the dark character in The IT Crowd, another glorious British series.

Anyway, even though I only saw about 15 minutes of an episode already in progress, I dare to recommend it emphatically. You can look for videos on YouTube and see what you think. I couldn't find anything very representative of the series, though; the official BBC channel has many of their musical numbers (which are totally unhinged) and the group's official DVD channel has several live appearances (also mostly musical).

DMD frontend 1.051 merged in LDC

by Leandro Lucarella on 2009-11-07 15:08 (updated on 2009-11-07 15:08)
tagged compiler, d, dmd, en, ldc, merge, software - with 0 comment(s)

After 5 or 6 DMD versions with important regressions, LDC has just been updated to DMD frontend 1.051. This brings a lot of bug fixes to the LDC world (DStress results are looking good! ;).

Lots of thanks to the LDC guys for merging the new frontend =)

TED: Witricity

by Leandro Lucarella on 2009-11-06 20:40 (updated on 2009-11-06 20:40)
tagged en, eric giler, ted, witricity - with 0 comment(s)

The concept of transferring energy without wires is pretty old. You can even have it now, with RFID for example (I even have a mouse that uses no battery; the pad transfers energy to the mouse using RFID; very good mouse, BTW).

But Eric Giler presents a nice new take on wireless electricity (the marketing name is Witricity), because the other kinds of wireless energy transfer I've seen carry very little power (to avoid frying your brain ;). This one works using a magnetic field instead of radio waves, which makes it possible to transfer larger amounts of energy without harm.

In the video you can see how it powers a big LCD screen, for example. I don't know how efficient this will be. At first sight it looks like it would waste a lot of energy, because generating the magnetic field uses energy all the time, even when there are no devices drawing from it.

Here is the video:

Patch to make D's GC partially precise

by Leandro Lucarella on 2009-11-06 11:43 (updated on 2009-11-06 20:09)
tagged d, en, gc, precise - with 0 comment(s)

David Simcha announced a couple of weeks ago that he wanted to work on making D's GC partially precise (only the heap). I was planning to do it myself eventually, because it looked like something doable without much work that could yield a big performance gain, and it is particularly useful for avoiding memory leaks due to false pointers (which can keep huge blocks of data artificially alive). But I didn't have the time, and I had other priorities.

Anyway, after some discussion, he finally announced he had the patch, which he attached to a bug report. The patch is being analyzed for inclusion, but the main problem now is that it is not integrated with the new operator, so if you want precise heap scanning, you have to allocate through a custom function (which builds a map of the type being allocated at compile time, to pass the information about the location of its pointers to the GC).

I'm glad David could work on this and I hope it can be included in D2, since it is a long-awaited GC feature.

Update

David Simcha has just been added to the list of Phobos developers. Maybe he can integrate his work on associative arrays too.

Te juro que no tengo parientes políticos!

by Leandro Lucarella on 2009-11-04 20:54 (updated on 2009-11-04 20:54)
tagged burocracia, es, política - with 0 comment(s)

Today I went to the bank to run an errand and, between one form to sign and the next, they asked me:

Do you have any relatives in politics?

I honestly didn't know whether I'd suddenly landed in 1976 or they were just making small talk, but no: when I awkwardly said (no, I won't put on my WTF face again) that I didn't think so (how should I know if I have some third uncle who's governor of Garompia!), they gave me a paper to sign.

I didn't have time to read much of what the paper said right there, but it seemed so bizarre that I asked for a copy to read at home, and it turned out to be a sworn statement for (and I quote) the "Prevention of money laundering and other illicit activities".

If you're curious, here are photos of the document:

1 2 3

I'd like to be naïve and think it's good for something...

Rompé, Pepe, rompé!

by Leandro Lucarella on 2009-11-04 00:44 (updated on 2009-11-04 00:44)
tagged es, fotonovela, personal, rompe pepe rompe - with 3 comment(s)

Warning

Looong post ;)

You may have noticed that I've been pretty quiet for several days now (except for today's outburst ;). This is basically due to a little problem at home. The story goes more or less like this (in fotonovela form):

One day we noticed the wall had started to blister slightly, showing some little bubbles:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/01-burbujitas.mini.jpg

After a few days they turned into something that made us think we might just have a little damp problem:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/02-humedad.mini.jpg

We decided to call a few plumbers to see what was up. One, let's call him José, had not the slightest idea what he was doing; he said he'd start breaking the wall where the first bubble had appeared (almost at the very top), even though there clearly was no pipe there. Thanks, José, better luck next time...

Another one, let's call him Guillermo, seemed slightly more serious; he said he suspected the bathroom (which is upstairs), whose drain runs vertically down the corner of the kitchen:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/03-esquina.mini.jpg

I very much doubted it was the bathroom, but (I can tell you in advance that) he later turned out to be right.

He recommended waiting a few days before doing anything, to see if damp showed up anywhere else, because if he had to do something right away it meant breaking the wall blindly. So that's what we did...

Time passed and the damp grew, but only a little. In those days I discovered, by chance, an inspection cover on the drain pipe:

04 05

But at first glance I didn't spot anything odd.

Tired of waiting, we decided to call Guillermo again so he could start chipping away, initially next to the water heater, which seemed to be the nearest pipe.

He came (two hours late; not cool, Guille!) and, just before he started chipping, it occurred to me to mention the inspection cover. He hesitated quite a bit before looking at it, but in the end he did, and sticking his hand in with more plumberly courage than I could ever muster, it turned out things weren't as dry as they looked from afar, so suspicion returned to the bathroom.

To confirm it, we ran an experiment: fill the bathtub with the plug in, then drain it. And boom! Some timid drops started to appear:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/06-gotas-abajo.mini.jpg

With a sorrowful face he gave me the news:

It's surely the floor drain, and to fix it properly you have to replace all the bathroom pipes [*]; and maybe take out the bathtub [†], although I don't think so.

After giving him my WTF face:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/07-cara-wtf2.mini.jpg

I said I would think about it, and he left without breaking anything.

We thought calling someone else might be a good idea, so we did. We called a plumber, let's call him Domingo, who referred us to an architect he knew, in the hope that he'd be better. He came, took one glance at the kitchen wall and said:

Ehhh, yeah, figure about 500 pesos.

I tried to get more details out of him about what he thought or what needed to be done, asking if he at least wanted to take a look at the bathroom, but no luck. Honestly, I don't even know why he came; to throw out a random number he could have just used the phone :S.

I couldn't find anyone else, so the dilemma was risking a known crook who was potentially cheaper, or an unknown who sounded convincing but basically wanted to redo the whole bathroom (less than 10 years old) from scratch.

I called Guillermo and, after I tried to convince him to just repair it, he ended up convincing me that there was no way to fix it properly other than replacing the pipes, so, fed up with the problem, we closed our eyes and pressed Enter [‡].

And so Guille came (again 2 hours late -.-) and started drilling for oil in our (only) bathroom, with the collaboration of the bidet, which kindly stepped aside:

08 09

The pipe appeared, but the damp didn't:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/10-agujero-caño.mini.jpg

So Guille told me:

We'll probably have to take out the bathtub.

WTF face again:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/07-cara-wtf2.mini.jpg

But by then we had already decided to go all or nothing, thinking as little as possible. Still, to rule out a problem fixable from the kitchen, we decided to break open the top corner to see if there was already damp there (something I had wanted to do ever since I learned it was dripping down that drain, but hadn't done because everyone thought it was absurd).

And so Guille Flintstone broke:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/11-guille.mini.jpg

And the drop was seen:

12 13

So the inevitable suddenly looked... well, inevitable. He mentioned in passing that I would hear noise because a neighbour's drain probably emptied into it too, which I didn't pay much attention to (because I considered it unlikely). We agreed that I'd give him an advance that day (he had to go withdraw money because he had none on him) and that the next day he'd come back with more people (who had bailed on him that day, supposedly why he'd arrived late) to basically break everything.

A little while after he left, I did indeed hear the sound of water in the pipe, and when I looked, not only was it dripping, there was also steam. Two important discoveries: the neighbour showers, and the pipe is broken in a shared section (probably at the T that joins them).

WTF face once more, because to top it off it's a neighbour we're not on the best of terms with:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/07-cara-wtf2.mini.jpg

I start to freak out a bit, because on top of the whole mess there's now talking, negotiating and coordinating with the neighbour. Besides, the reasoning goes: if the only way to fix this is replacing all the pipes, because a patch won't hold, does the neighbour have to replace all his pipes too? Suddenly the solution doesn't scale (something plenty of people had already warned us about, with phrases like "tell him to go replace all the pipes at his place!", but which we had decided to ignore to play it safe, given that the architect acquaintance didn't find replacing everything outlandish).

Summary: I talk to the neighbour and we agree that he'll make inquiries on his side, considering the possibility of breaking into his place (he behaved well, given how little goodwill he'd been showing day to day). Guillote Flintstone, wanting to redo the bathroom at all costs, gets offended after I initially put the job on hold because I had to wait for the neighbour. Leaving in indignation, accusing us of jerking him around for suspending the job, and unable to justify replacing all the pipes on my side and none on the neighbour's (with the moral and logistical support of my old man, whom he couldn't sweet-talk as much as me because he knows a bit more about these things), he earned the red card [§]. I found another plumber, let's call him Ricardo, whom I liked much better; he proposed breaking the kitchen ceiling instead of the bathroom (something that had also crossed my mind at the very beginning, but was dismissed as outlandish) and told me it could be repaired without replacing anything (it never even crossed his mind to replace all the pipes or take out the bathtub).

In the end the neighbour disappointed us and was unreachable for the rest of the week. When I finally tracked him down on Saturday night, he told me with an ouch! face, without even opening the door (it was a rainy day and I was getting wet outside), that he hadn't found anything out. With our patience as full as the bucket:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/14-balde.mini.jpg

Because it won't stop dripping:

https://www.llucax.com.ar:443/blog/posts/2009/11/03-rompé-pepe-rompé/15-gota-trapo.mini.jpg

We decided to start breaking on our side and, if things get complicated, talk again. Now we're waiting for Ricardo to be as good as he seems and to call us when he's free to start.

Meanwhile we'll keep living with a crater in the bathroom and changing the little bucket every day, so our wall stops soaking up so much damp.

To be continued... (?)

[*] ~ $2000.
[†] Another ~ $2000 if it has to go.
[‡] Before this I did a lot of plumbing research, getting deep into the subject, running tests by opening and closing every single tap in the house while auditing the leak, but I don't want to bore you with the details ;)
[§] Not explicitly; I paid him for services rendered and sent him off with the typical "I'll call you if anything comes up". Two days later he called me, saying he'd been left a message that some Leandro had called (yeah! right...), timidly asking how things had gone with the neighbour, which I took as the opportunity to inform him of his official expelled status.

Ginóbili San

by Leandro Lucarella on 2009-11-03 09:51 (updated on 2009-11-03 12:32)
tagged basketball, bat, es, ginobili, video, youtube - with 0 comment(s)

Mr. Miyagi (may he rest in peace) once said that if you can catch a fly with chopsticks, you can accomplish anything. I don't know if what Emanuel Ginóbili did is quite the same thing, but it's very close.

It turns out that a bat showed up in the middle of a game, and while a 2-metre, 100-kilo guy dodged it like a little girl, that son of a gun Manu crouched down and caught it with a single swipe. Like a cat.

If you don't believe me, here's the video:

Beyond the fact that I don't approve (the cruelty of killing it [*] for no reason, poor little bat, and the cruelty of the crowd for cheering), it shows that young man's inhuman skill.

Now he'll have to pay for it with 8 rabies shots, so, as he says on his Facebook, he learned his lesson and won't do it again; he has already shown he has the skill and can accomplish anything =)

Update

The news is a bit old, but the original date it happened on is no small thing: it was October 31st (Halloween).

[*] Reportedly, the bat was taken outside and flew away.

The D Programming Language

by Leandro Lucarella on 2009-10-29 12:37 (updated on 2009-10-29 12:37)
tagged andrei alexandrescu, book, d, en, the d programming language - with 1 comment(s)

https://www.llucax.com.ar:443/blog/posts/2009/10/29-the-d-programming-language.jpg

Version 2.0 of D will be released in sync with the classic book titled after the language, in this case The D Programming Language, written by Andrei Alexandrescu. You can follow the book's progress on his home page, where a word and page count and a short-term objective are regularly updated.

He posted a little introductory excerpt from the book a while ago, and yesterday he published a larger excerpt: the whole of chapter 4, about arrays, associative arrays and strings.

If you don't know much about D, it could be a good way to take a peek.

Subdownloader

by Leandro Lucarella on 2009-10-25 18:41 (updated on 2009-10-25 18:41)
tagged es, movie, software, subdownloader, subtitle, tv - with 0 comment(s)

Subdownloader is another great invention for people who watch TV on their computer. As the name suggests, this friendly program makes it easy to download (and upload!) subtitles, using any server that exposes an API compatible with opensubtitles.org.

Particularly useful: it looks up subtitles by computing a hash of the video file, so you can forget about the problem of downloading a subtitle for the wrong release.
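
Since matching works by hashing the video file, here is a Python sketch of the hash scheme commonly described for opensubtitles.org-compatible servers (the file size plus the 64-bit little-endian word sums of the first and last 64 KiB, modulo 2^64). Treat the details as an assumption and check the real API documentation before relying on it:

```python
import os
import struct

def file_hash(path, chunk=64 * 1024):
    """Sketch of an opensubtitles.org-style file hash: the file size plus
    the little-endian uint64 word sums of the first and last 64 KiB,
    all reduced modulo 2**64."""
    size = os.path.getsize(path)
    h = size
    with open(path, "rb") as f:
        # sum 8-byte words from the head and the tail of the file
        for offset in (0, max(0, size - chunk)):
            f.seek(offset)
            data = f.read(chunk)
            # pad so we always unpack whole 8-byte words
            data += b"\x00" * (-len(data) % 8)
            for (word,) in struct.iter_unpack("<Q", data):
                h = (h + word) & 0xFFFFFFFFFFFFFFFF
    return "%016x" % h
```

Two releases of the same movie differ in bytes, so they hash differently, which is exactly why the lookup can't fetch a subtitle for the wrong version.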

LLVM 2.6

by Leandro Lucarella on 2009-10-24 18:30 (updated on 2009-10-24 18:30)
tagged d, en, llvm, release, software - with 0 comment(s)

Just in case you're not that well informed, Chris Lattner has just announced the release of LLVM 2.6. Enjoy!

War videos

by Leandro Lucarella on 2009-10-23 00:04 (updated on 2009-10-23 00:04)
tagged animation, battleground, en, food, map, video, war, youtube - with 0 comment(s)

Here are two very sad videos about wars.

The first is a representation of the battles in the last 1000 years as explosions in a world map. The size of the explosion is proportional to the number of deaths.

I guess it's missing a lot of small battles because you can't see any explosions in very big regions (like Africa, Latin America and India) until some empire tries to conquer them. I'm sorry if I depressed you too much.

The second video is at least cute, if you forget what it's really about. It's an animation of food representing several armed conflicts. Each country is represented by a regional food (you can check the cheat sheet if you get lost).

Found at No Puedo Creer. Lots of interesting stuff there (in Spanish though).

MIT Indoor Autonomous Helicopter

by Leandro Lucarella on 2009-10-21 15:26 (updated on 2009-10-21 15:26)
tagged en, helicopter, mit, robot, robotics - with 0 comment(s)

See this nice video.

This is the complete platform for indoor autonomous flight, developed under Nick Roy in the Robust Robotics Group at CSAIL.

KLEE, automatically generating tests that achieve high coverage

by Leandro Lucarella on 2009-10-20 11:20 (updated on 2009-10-20 11:20)
tagged coverage, d, en, klee, llvm, software, test, vm - with 2 comment(s)

This is the abstract of the paper describing KLEE, a new LLVM sub-project announced with the upcoming 2.6 release:

We present a new symbolic execution tool, KLEE, capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs. We used KLEE to thoroughly check all 89 stand-alone programs in the GNU COREUTILS utility suite, which form the core user-level environment installed on millions of Unix systems, and arguably are the single most heavily tested set of open-source programs in existence. KLEE-generated tests achieve high line coverage — on average over 90% per tool (median: over 94%) — and significantly beat the coverage of the developers' own hand-written test suites. When we did the same for 75 equivalent tools in the BUSYBOX embedded system suite, results were even better, including 100% coverage on 31 of them. We also used KLEE as a bug finding tool, applying it to 452 applications (over 430K total lines of code), where it found 56 serious bugs, including three in COREUTILS that had been missed for over 15 years. Finally, we used KLEE to cross-check purportedly identical BUSYBOX and COREUTILS utilities, finding functional correctness errors and a myriad of inconsistencies.

I have to try this...

Ricardo Garmendia, el gaucho sónico

by Leandro Lucarella on 2009-10-19 21:22 (updated on 2009-10-19 21:22)
tagged cha cha cha, es, gaucho sónico, humor, music, video, youtube - with 0 comment(s)

I'm opening a space for the promotion of experimental music, presenting Ricardo Garmendia, better known as El Gaucho Sónico.

Note

I know it's old, but it never hurts to remember it =)

Anti-love song

by Leandro Lucarella on 2009-10-17 01:01 (updated on 2009-10-17 01:01)
tagged en, lyrics, music, song, the beautiful south - with 0 comment(s)

I've always found fascinating the mixture of beauty and terror that The Beautiful South is capable of =P

For instance, read the lyrics of Something That You Said, from the album 0898 Beautiful South. Here are some fragments:

The perfect love song it has no words it only has death threats
And you can tell a classic ballad by how threatening it gets
So if you walk into your house and she's cutting up your mother
She's only trying to tell you that she loves you like no other
No other, she loves you like no other.
[...]
The perfect love has no emotions, it only harbours doubt
And if she fears your intentions she will cut you out
So do not raise your voice and do not shake your fist
Just pass her the carving knife, if that's what she insists
[...]
The perfect kiss is dry as sand and doesn't take your breath
The perfect kiss is with the boy that you've just stabbed to death

But please, go and read the full lyrics first.

Now try to picture what this song sounds like (if you don't already know it, of course =). You might think it sounds like some creepy death metal band, but no. You can hear 30 seconds of the song at last.fm to find out how it really sounds.

The song is awfully peaceful, and the voice of Briana Corrigan is incredibly beautiful. But what makes this a great song for me is the contrast between the music and the lyrics. They have plenty of songs using this resource, and a lot of irony (for example, the more popular Song For Whoever).

For those who don't know anything about this band, it was formed by two ex-members of The Housemartins (I hope you know them =).

pybugz, a python and command line interface to Bugzilla

by Leandro Lucarella on 2009-10-16 11:14 (updated on 2009-10-16 11:14)
tagged bugzilla, cli, d, en, pybugz, python, software - with 0 comment(s)

Tired of the clumsy Bugzilla web interface? Meet pybugz, a command line interface for Bugzilla.

An example workflow from the README file:

$ bugz search "version bump" --assigned liquidx@gentoo.org

 * Using http://bugs.gentoo.org/ ..
 * Searching for "version bump" ordered by "number"
 101968 liquidx net-im/msnlib version bump
 125468 liquidx version bump for dev-libs/g-wrap-1.9.6
 130608 liquidx app-dicts/stardict version bump: 2.4.7

$ bugz get 101968

 * Using http://bugs.gentoo.org/ ..
 * Getting bug 130608 ..
Title : app-dicts/stardict version bump: 2.4.7
Assignee : liquidx@gentoo.org
Reported : 2006-04-20 07:36 PST
Updated : 2006-05-29 23:18:12 PST
Status : NEW
URL : http://stardict.sf.net
Severity : enhancement
Reporter : dushistov@mail.ru
Priority : P2
Comments : 3
Attachments : 1

[ATTACH] [87844] [stardict 2.4.7 ebuild]

[Comment #1] dushistov@----.ru : 2006-04-20 07:36 PST
...

$ bugz attachment 87844

 * Using http://bugs.gentoo.org/ ..
 * Getting attachment 87844
 * Saving attachment: "stardict-2.4.7.ebuild"

$ bugz modify 130608 --fixed -c "Thanks for the ebuild. Committed to
portage"

D and open development model

by Leandro Lucarella on 2009-10-15 17:09 (updated on 2009-10-15 17:09)
tagged compiler, d, development model, dmd, druntime, en, phobos, software - with 6 comment(s)

Warning

Long post ahead =)

I'm very glad that yesterday DMD had its first releases (DMD 1.050 and DMD 2.035) with a decent revision history. It took Walter Bright some time to understand how the open source development model works, and I think he still has a lot more to learn, but now I have some hope about the future of D.

Not long ago, neither Phobos, DMD nor Druntime had revision control. Druntime didn't even exist, leaving D 1 split in two because of the Phobos vs. Tango dichotomy. The DMD back-end sources were not available either, and Walter Bright was the only person writing stuff (sometimes not because people didn't want to help, but because he was too anal retentive to let them ;). It was almost impossible to contribute patches back then (your only chance was hacking GDC, which is pretty hard).

Now I can say that DMD, Phobos and Druntime have full source availability (the DMD back-end is not free/libre, though), and almost all parts of DMD have their sources published under a source control system. The core team has been expanded, and even though Walter Bright is still in charge, at least 3 developers are now very committed to D: Andrei Alexandrescu (in charge of Phobos), Sean Kelly (in charge of Druntime) and Don Clugston (squashing DMD bugs at full speed, especially in the back-end). Other people are contributing patches on a regular basis. About 72 patches were submitted to Bugzilla before DMD was distributed with full source (72 patches in ~10 years); since then, 206 patches have been submitted (that is, 206 patches in less than 8 months).

But even with this great improvement, there is still much left to do (and I'm talking only about the development model). This is a small list of what I think is necessary to keep moving towards a more open development model:

Releases

The release process should be improved. Other people and I have been suggesting release candidates. This would allow people to test new releases and find any regressions. As things are now, releases are not much different from a nightly build, except that you don't have one available every night :). People get very frustrated when they download a new version of the compiler and things stop working, and this holds back front-end updates in other compilers, like LDC (which is frozen at 1.045 because of the regressions found in the next 5 versions).

I think Walter Bright is suffering from premature releasing too. Releases come out of nowhere, when nobody expects them; nobody knows when a new compiler version will be released. I think that hurts the language's reliability.

I think the releases should be more predictable. A release schedule (even when not very accurate, like in many other open source projects) gives you some peace of mind.

Peer review

Even though commits are fairly small now in DMD, I think they are far from ideal. It's very common to see unrelated changes in a commit (the classic example is the compiler version number being bumped in a bug fix). See revision 214, for example: the compiler version is bumped and there are some changes to the new JSON output, totally unrelated to bug 3401, which the commit is supposed to fix; or revision 213, which announces the release of DMD 1.050 and DMD 2.035 while introducing a bunch of changes that who knows what they're supposed to do (well, they look like the introduction of the new T[new] type, but that's not even documented in the release changelog :S). This is bad for several reasons:

  • Reviewing a patch with unrelated changes is hard.
  • If you want to fold in an individual patch (let's say, the LDC guys want to fold in a bug fix), you have a lot of junk to take care of.
  • If you want to do some sort of bisection to find a regression, you still have to figure out which group of related changes introduced the regression.

I'm sure there are more...

Commit messages lack a good description of the problem and the solution. Most commit messages in DMD are just "bugzilla N". You have to go to the Bugzilla bug to know what it's all about. Don's patches, for example, usually come with very good and juicy information about the bug's causes and why the patch fixes it (see an example). That is a good commit message. You can learn a lot about the code by reading well commented patches, which can lead to more contributions in the future.

Commits in Phobos can be even worse. The commits with a message "bugzilla N" are usually the good ones. There are 56 commits that have just "minor" as the commit message. Yes, just "minor". That's pretty useless; it's very hard to review a patch when you don't know what it is supposed to do. Commit messages are the basis of peer review, and peer review is the basis of high quality code.

So I think D developers should focus a lot more on commit messages. I know it can sound silly at first, but I think it would be a huge gain for very little effort.

Besides this, commits should be mailed to a newsgroup or mailing list to ease peer review. Right now it's a little hard to make comments about a commit: you have to post the comment to the D newsgroup or send it by personal e-mail to the author. The former is not that bad, but it's not easy to include context, and people reading the comment will probably have to open a browser and search for the commented commit. This clearly makes peer review more difficult, when the ideal would be to encourage it. The private mail is simply wrong, because other people can't see the comments.

Source control and versioning

This one is tightly related to the previous two topics. Using a good DVCS could help a lot too. Subversion has a lot of problems with branching, which makes releases harder as well (since having a branch for each release is very painful). It's bad for commit messages too, because there is no real difference between branches and directories, so now every commit is duplicated (changes for both DMD 1 and 2 are included). It's not easy to cherry-pick single commits either, and you can't fix your commits if you messed up, which leads to a lot of commits of the style "Woops! Fix the typo in the previous commit.".

I'm sure both the release process and peer reviewing can be greatly improved by using a better DVCS.

Easy branching can also lead to a faster-evolving and more reliable language. Yes, both are possible with branches. Right now there are 2 branches: stable (D1) and experimental (D2). D1 is almost frozen and people are showing less and less interest in it as it ages, while D2 is too unstable for real use. Having some intermediate branch could be really helpful. For example, it has been announced that the concurrency model proposed by Bartosz Milewski will not be part of D2 because there is not enough time to implement it, since D2 should be released fairly soon: Andrei Alexandrescu is writing a book that has a deadline, and the language has to be finalized by the time the book is published.

So concurrency (like AST macros) is delayed to D3. D2 is more than 2 years old, so one should expect D3 to be at least 5 years away (assuming D2 takes 2.5 years and D3 takes the same). That might be too much time.

I think the language should adopt a model closer to Python's, where a minor language version (with backward compatible improvements) is released every 1 to 1.5 years. The last major version took about 8 years, but considering how many new features Python included in minor versions, that's not a big issue. The last major version was mostly a clean-up of old/nasty stuff, not huge changes to the language.

Licensing

I think the DMD back-end should have a better license. Personal use is simply not enough for the reference implementation of a language that wants to hit the mainstream. If you plan to do business with it, not being able to patch the compiler when you need to and distribute the result is not an option.

This is for the sake of DMD only, because other compilers (like LDC and GDC) are fully free/libre.

Conclusion

Some of the things I mention are really hard to change, as they modify how people work and imply learning new tools. But others are fairly easy, and can be done progressively (like providing release candidates and improving commits and commit messages).

I hope Walter Bright & Co. keep walking the openness road =)

LLVM developer meeting videos available

by Leandro Lucarella on 2009-10-15 10:55 (updated on 2009-10-15 13:35)
tagged clang, d, en, llvm, llvm developer meeting, video - with 0 comment(s)

Chris Lattner announced that the videos from the last LLVM developer meeting are now available. They are usually very interesting, so I'd recommend watching them if you have some time.

Update

Big WTF and many anti-cool-points for Apple:

On Oct 15, 2009, at 8:29 AM, Anton Korobeynikov wrote:
[...]
> I'm a bit curious: is there any reason why are other slides / videos
> not available (it seems that the ones missing are from Apple folks)?

Unfortunately, we found out at the last minute that Apple has a rule
which prevents its engineers from giving video taped talks or
distributing slides.  We will hold onto the video and slide assets in
case this rule changes in the future.

-Chris

Fragment from a response to the announcement.

Mutt patched with NNTP support for Debian (and friends)

by Leandro Lucarella on 2009-10-14 01:01 (updated on 2009-10-14 01:01)
tagged d, debian, en, mutt, nntp, patch, ubuntu, vsevolod volkov - with 2 comment(s)

Have you ever wanted Mutt with NNTP support packaged up for your Debian (or Debian-ish) box, but were too lazy to do it yourself? Did you even try to report a bug so the patch could be applied to the official Debian package, only to have the maintainers tell you they won't do it?

If so, this is a great day for you, because I did it and I'm giving it away at no charge in this one-time-only opportunity!!! =P

Seriously, I can understand why the maintainers don't want to support it officially: it's a big patch and it can be some work to fold it in. So I did it myself, and it turned out it wasn't that bad.

I adjusted the patch maintained by Vsevolod Volkov to work on top of all the other patches included in the mutt-patched Debian package (the only conflicting patch is the sidebar patch, plus some files that don't exist because the patch should be applied after the autotools files are generated, while Debian applies its patches before that) and built the package using the latest Debian source (1.5.20-4).

You can find the source package and the binary packages for Debian unstable i386 here, along with the modified NNTP patch.

If you have Ubuntu or other Debian based distribution, you can compile the binary package by downloading the files mutt_1.5.20-4luca1.diff.gz, mutt_1.5.20-4luca1.dsc and mutt_1.5.20.orig.tar.gz, then run:

$ sudo apt-get build-dep mutt
$ dpkg-source -x mutt_1.5.20-4luca1.dsc
$ cd mutt-1.5.20
$ dpkg-buildpackage -rfakeroot
$ cd ..
$ sudo dpkg -i mutt_1.5.20-4luca1_i386.deb \
        mutt-patched_1.5.20-4luca1_i386.deb

Now you can enjoy reading the D newsgroups and your favourite mailing lists via Gmane with Mutt without leaving the beauty of your packaging system. No need to thank me, I'm glad to be helpful ;)

Don't marry or set sail

by Leandro Lucarella on 2009-10-13 13:15 (updated on 2009-10-13 13:15)
tagged - with 0 comment(s)

Don't be a fool, it's Tuesday the 13th.

Lessfs

by Leandro Lucarella on 2009-10-11 16:56 (updated on 2009-10-11 16:56)
tagged backup, data deduplication, en, fs, lessfs, linux - with 0 comment(s)

Lessfs is an open source data deduplication filesystem:

Data deduplication (often called "intelligent compression" or "single-instance storage") is a method of reducing storage needs by eliminating redundant data. [...] lessfs can determine if data is redundant by calculating an unique (192 bit) tiger hash of each block of data that is written. When lessfs has determined that a block of data needs to be stored it first compresses the block with LZO or QUICKLZ compression. The combination of these two techniques results in a very high overall compression rate for many types of data.

Україна має талант

by Leandro Lucarella on 2009-10-10 20:27 (updated on 2009-10-10 20:27)
tagged animation, en, kseniya simonova, music, sand - with 0 comment(s)

I'm not Ukrainian, I just like how weird foreign symbols look in my blog =P

Україна має талант means something like Ukraine's Got Talent and is where Kseniya Simonova's fame comes from. What she does is indescribable; you just have to see a video.

You might enjoy it (or understand it) a little more if you read about what's going on before actually seeing the videos.

Here is a fragment from a small article:

The appearance of a shy 24-year-old on a Ukrainian TV talent show this year has caused a nation to revisit its painful wartime past and is well on the way to becoming an international sensation.

About 13 million people watched Kseniya Simonova win Ukraine's Got Talent live with an extraordinary demonstration of "sand art". Most of them, according to reports, were weeping.

file:line VIM plug-in

by Leandro Lucarella on 2009-10-10 16:59 (updated on 2009-10-10 16:59)
tagged en, file:line, plugin, vim - with 0 comment(s)

This VIM script should be part of the official VIM distribution:

When you open a file:line, for instance when copying and pasting from an error from your compiler VIM tries to open a file with a colon in its name. With this little script in your plugins folder if the stuff after the colon is a number and a file exists with the name specified before the colon VIM will open this file and take you to the line you wished in the first place.

Link Time Optimization

by Leandro Lucarella on 2009-10-10 15:34 (updated on 2009-10-10 15:34)
tagged binutils, d, en, gcc, gdc, gold, ldc, llvm, lto - with 0 comment(s)

The upcoming LLVM 2.6 will include a plug-in for Gold to implement Link Time Optimization (LTO) using LLVM's LibLTO. There is a similar project for GCC, merged into the main trunk about a week ago. It will be available in GCC 4.5.

This is all fairly new and will not be enabled by default in LLVM (I don't know about GCC), but it will add a lot of new optimization opportunities in the future.

So people using LDC and GDC will probably be able to enjoy LTO in the near future =)

Seinfeld is back

by Leandro Lucarella on 2009-10-09 11:33 (updated on 2009-10-09 11:33)
tagged curb your enthusiasm, es, larry david, seinfeld, serie, series, tv - with 0 comment(s)

Well, almost. It's a parody inside another show, Curb Your Enthusiasm, by Larry David, the other face of Seinfeld, the invisible one, who came into the spotlight 7 years ago with a show that is essentially very similar (the guy basically plays himself too, and the show is framed as a pseudo reality show of his life).

The thing is, for this season, its seventh, he plans to bring Seinfeld back, but of course, only within the fiction of Curb Your Enthusiasm (better than nothing ;)

The show is not as good as Seinfeld (so if you don't know it, don't get your hopes up too much either), but it has very good moments and very funny characters (for me the best one is the foul-mouthed Susie, by far).

Campanas por la gripe A

by Leandro Lucarella on 2009-10-08 21:20 (updated on 2009-10-08 21:20)
tagged es, gripe, h1n1, oms, video - with 0 comment(s)

Here it's already old news, but the topic seems to be gaining momentum in the northern hemisphere.

Campanas por la gripe A is a video as interesting as it is bizarre. It features a kind of Scully (she's a doctor and talks about conspiracies of global proportions), only Catalan and a nun (!).

Beyond the WTF factor, it's very interesting (and not at all crazy, as I suppose I just made it sound). She talks about the A/H1N1 flu with a lot of objective data and reviews irregularities that have already been reported.

For example, she explains why the flu could be considered a pandemic (because the definition was changed so that high mortality is no longer a requirement).

I know it's hard to convince anyone to watch a one-hour video of a nun talking, but do it if you can =P

Stats for the basic GC

by Leandro Lucarella on 2009-10-08 20:08 (updated on 2009-10-08 20:08)
tagged basic, benchmark, d, dgc, dgcbench, en, gc, statistics - with 0 comment(s)

Here are some graphs made from my D GC benchmarks using the Tango (0.99.8) basic collector, similar to the naive ones but using histograms for allocations (time and space):

big_arrays rnd_data rnd_data_2 split tree

Some comments:

  • The Wasted space is the Uncommitted space (since the basic GC doesn't track the real size of the stored objects).
  • The Stop-the-world time is the time all the threads are stopped, which is almost the same as the time spent scanning the heap.
  • The Collect time is the total time spent in a collection. The difference from the Stop-the-world time is almost the same as the time spent in the sweep phase, which is done after the threads have been resumed (except for the thread that triggered the collection).

There are a few observations to make about the results:

  • The stop-the-world time varies a lot. There are tests where it's almost unnoticeable (tree), tests where it's almost equal to the total collection time (rnd_data, rnd_data_2, split) and tests where it's somewhere in the middle (big_arrays). I can't see a pattern though (like heap occupancy).
  • There are tests where collections seem to be triggered for no reason: there is plenty of free space when they are triggered (tree and big_arrays). I haven't investigated this yet, so if you can see a reason, please let me know.

The Wire

by Leandro Lucarella on 2009-10-07 12:07 (updated on 2009-10-07 12:07)
tagged david simon, ed burns, es, serie, series, the wire, tv - with 0 comment(s)

Watch The Wire.

A comment by Hernán Casciari that, for me, sums up the show perfectly:

García Márquez used to say he admired the composers of boleros because they could tell a love story in three minutes, while it took him six hundred pages. Well. CSI is a bolero. The Wire is Love in the Time of Cholera.

And that's even though I haven't read Love in the Time of Cholera, nor do I listen to boleros =P

If you want to read something more developed on the subject, you can read the original post I took that quote from, or other posts about it.

The only thing I'll tell you is that, for me, it's among the best shows of all time; abroad it seems to be a cult show, and here I have the feeling nobody knows it. Each season goes after a different sector of society (the street, the unions, politics, education and journalism, respectively) with drugs (trafficking, distribution and consumption) as the common thread, and it does so with a class, style and realism that is scary. Or a pleasure, if you're a masochist =)

Tucan {up,down}load manager for file hosting sites

by Leandro Lucarella on 2009-10-06 11:02 (updated on 2009-10-06 11:02)
tagged download, en, floss, python, software, tucan, upload - with 0 comment(s)

Meet Tucan:

https://www.llucax.com.ar:443/blog/posts/2009/10/tucan.png

Tucan is a free and open source application designed for automatic management of downloads and uploads at hosting sites like Rapidshare.

GDC resurrection

by Leandro Lucarella on 2009-10-05 11:31 (updated on 2009-10-05 11:31)
tagged compiler, d, en, floss, gcc, gdc, software - with 0 comment(s)

About a month ago, the GDC newsgroup started to get some activity when Michael P. and Vincenzo Ampolo started working on updating GDC. Yesterday they announced that they had successfully merged the DMD 1.038 and 2.015 front-ends, along with a new repository for GDC. They will be hanging out on #d.gdc if you have any questions or want to help out.

So great news for the D ecosystem! Kudos to these two brave men! =)

YikeBike & Mini-Farthing

by Leandro Lucarella on 2009-10-02 23:33 (updated on 2009-10-02 23:33)
tagged bike, design, en, mini-farthing, yikebike - with 2 comment(s)

YikeBike, an implementation of a mini-farthing. Too bad it's a proprietary design...

DGC page is back

by Leandro Lucarella on 2009-10-02 13:17 (updated on 2009-10-02 13:17)
tagged d, dgc, en, self, web - with 0 comment(s)

I've migrated the wiki pages about DGC from Redmine to Sphinx.

The Yes Men

by Leandro Lucarella on 2009-10-02 00:05 (updated on 2009-10-02 00:05)
tagged activism, dvd, en, es, identity correction, movie, the yes men - with 0 comment(s)

English

Watch The Yes Men.

Identity Correction

Impersonating big-time criminals in order to publicly humiliate them. Targets are leaders and big corporations who put profits ahead of everything else.

Links:


TV Online

by Leandro Lucarella on 2009-09-30 21:38 (updated on 2009-09-30 21:38)
tagged es, live, program, streaming, tv - with 0 comment(s)

For those who don't have a TV (like yours truly, i.e. me): if you miss a bit of channel surfing every now and then, know that Tivion exists. I already asked the author to add the Canal 7 stream so I can watch Peter Capusotto [*] (although the quality leaves a lot to be desired :S).

[*] Actually it's nothing more than a front-end for MPlayer, but it's still nice, if only as a repository of channels.

Fantastic Photos of our Solar System

by Leandro Lucarella on 2009-09-30 11:27 (updated on 2009-09-30 11:27)
tagged en, es, flare, photo, solar system, sun - with 0 comment(s)

Cinépata

by Leandro Lucarella on 2009-09-29 22:54 (updated on 2009-09-29 22:54)
tagged chile, cine, corto, creative commons, documental, es, experimental, largo - with 0 comment(s)

Cinépata is a Chilean site with several films under Creative Commons licenses: features, shorts, documentaries, clips and experimental films (some of them Argentine, like Como un avión estrellado).

They can be watched online or downloaded in several formats.

8-bit Lego Trip

by Leandro Lucarella on 2009-09-29 12:31 (updated on 2009-09-29 12:31)
tagged en, es, lego, link, short film, stop-motion, video, youtube - with 0 comment(s)

Feeds

by Leandro Lucarella on 2009-09-28 23:49 (updated on 2009-09-28 23:49)
tagged blitiri, blog, en, feed, self, tag - with 0 comment(s)

I found out that my blog software (blitiri) already supports tag-specific feeds; you just have to add some extra GET variable(s) to the URL, for example:

https://www.llucax.com.ar/blog/blog.cgi/atom?tag=en&tag=self

This URL will get you the posts with both tags: en and self. I've set up some common feeds at FeedBurner (en, es and D for now). Please use those if you can (i.e. if you don't need a feed for other tags).

Espoiler TV

by Leandro Lucarella on 2009-09-28 16:22 (updated on 2009-09-28 16:22)
tagged es, espoiler, tv, web - with 2 comment(s)

For people who usually watch TV shows on their PC, the Espoiler TV schedule might be interesting (and the blog too).

World Digital Library

by Leandro Lucarella on 2009-09-28 00:27 (updated on 2009-09-28 00:27)
tagged en, es, unesco, web, world digital library - with 0 comment(s)

New home page and blog

by Leandro Lucarella on 2009-09-28 00:15 (updated on 2009-09-28 23:38)
tagged en, self - with 3 comment(s)

Finally I removed my Redmine instance because it was eating up all my (modest) server memory. For my home page I'm using mostly static pages, rendered from reStructuredText using Sphinx. It's not particularly nice, but it's simple and cheap :)

I was a little tired of posting to several blogs (my thesis blog about DGC, Mazziblog and 4am), so I decided to centralize things. From now on, I'll be posting just here, I guess :)

So I'm generalizing my ex-"thesis blog" into some kind of "planet Luca". The good news is I plan to post a lot more; the bad news is that the posts' quality will probably decrease =P, because I want to use this blog as a kind of notebook. I hope what I post is useful and interesting to other people, but I can't promise anything.

I will try to post in English, except when the topic makes no sense for non-Spanish speakers (or non-Argentine people :). You can subscribe to English-only or Spanish-only posts using the en or es tags respectively.

Update

I'm sorry, but my blog doesn't support per-tag feeds yet, so the en/es feeds are not working properly (they feed the whole blog content for now).

I'll let you know when this is fixed.

You can navigate the en and es tags in the web view though.

Update

Language-specific feeds (en/es) are now working ;)

In fact, you can get a feed for any (AND combination of) tags you want by adding "tag" GET variables to the Atom URL. For example, you can receive only posts in English about garbage collection.

Life in hell

by Leandro Lucarella on 2009-09-06 18:24 (updated on 2009-09-06 18:24)
tagged asm, benchmark, d, debug, dgc, dgcbench, dil, en, gc, gdb, naive, statistics - with 0 comment(s)

Warning

Long post ahead =)

As I said before, debugging is hell in D, at least if you're using a compiler that doesn't write proper debug information and you're writing a garbage collector. But you have to do it when things go wrong. And things usually go wrong.

This is a small chronicle about how I managed to debug a weird problem =)

I had my Naive GC working and getting good stats with some small micro-benchmarks, so I said let's benchmark something real. There are almost no real D applications out there, suitable for an automated GC benchmark at least [1]. Dil looked like a good candidate, so I said let's use Dil in the benchmark suite!

And I did. But Dil didn't work as I expected, even when running it without arguments, in which case a nice help message like this should be displayed:

dil v1.000
Copyright (c) 2007-2008 by Aziz Köksal. Licensed under the GPL3.

Subcommands:
  help (?)
  compile (c)
  ddoc (d)
  highlight (hl)
  importgraph (igraph)
  python (py)
  settings (set)
  statistics (stats)
  tokenize (tok)
  translate (trans)

Type 'dil help <subcommand>' for more help on a particular subcommand.

Compiled with Digital Mars D v1.041 on Sat Aug 29 18:04:34 2009.

I got this instead:

Generate an XML or HTML document from a D source file.
Usage:
  dil gen file.d [Options]

Options:
  --syntax         : generate tags for the syntax tree
  --xml            : use XML format (default)
  --html           : use HTML format

Example:
  dil gen Parser.d --html --syntax > Parser.html

Which isn't even a valid Dil command (it looks like a dead string in some data/lang_??.d files).

I ran Valgrind on it and detected a suspicious invalid read of size 4 when reading the last byte of a 13-byte class instance. I thought maybe the compiler was assuming the GC allocates blocks whose sizes are multiples of the word size, so I made gc_malloc() allocate multiples of the word size, but nothing happened. Then I thought that maybe the memory blocks should be aligned to a word multiple, so I made gc_malloc() align the data portion of the cell to a multiple of a word, but nothing.

Since Valgrind only detected that problem, which was in the static constructor of the tango.io.Console module, I thought it might be a Tango bug, so I reported it. But it wasn't Tango's fault. The invalid read looked like a DMD 1.042 bug; DMD 1.041 didn't have that problem, but my collector still failed to run Dil. So I was back to square one.

I tried the Tango stub collector and it worked, so I tried mine with collections disabled, and it worked too. So finally I could narrow the problem down to the collection phase (which isn't much, but it's something). The first thing I could think of that could go wrong in a collection is cells still in use being swept as if they were unused, so I then disabled the sweep phase only, and it kept working.

So, everything pointed to prematurely freed cells. But why was my collector freeing cells prematurely when it's so, so simple? I reviewed the code a couple of times and couldn't find anything evidently wrong. To confirm my theory, and with the hope of getting some extra info, I decided to write a weird pattern into the swept cells and then check that the pattern was intact when giving them back to the mutator (the basic GC can do that too if compiled with -debug=MEMSTOMP). That would confirm that the swept memory was still in use. And it did.

Then I tried this modified GC (with memory stomping) with my micro-benchmarks and they worked just fine, so I started to doubt again that it was my GC's problem. But since those benchmarks didn't use much of the GC API, I thought maybe Dil was using some strange features or making some assumptions that were only true for the current implementation, so I asked Aziz Köksal (Dil's creator) and he pointed me to a portion of code that allocated memory from the C heap, overriding the new and delete operators for the Token struct. There is a bug in Dil there, because apparently that struct stores pointers to the GC heap but is not registered as a root, so it looked like a good candidate.

So I commented out the overridden new and delete operators, so that the regular GC-based operators were used. But I still got nothing; the wrong help message was printed again. Then I saw that Dil was manually freeing memory using delete, so I decided to make my gc_free() implementation a NOP to let the GC take over all memory management... And finally everything [2] worked out fine! =)

So, the problem should be either my gc_free() implementation (which is really simple) or a Dil bug.

In order to get some extra information on where the problem was, I changed the Cell.alloc() implementation to use mmap to allocate whole pages: one for the cell's header, and one or more for the cell data. This way, I could easily mprotect the cell data when the cell was swept (and un-mprotect it when it was given back to the program) in order to make Dil segfault exactly where the freed memory was used.

I ran Dil using strace and this is what happened:

[...]
 (a)  write(1, "Cell.alloc(80)\n", 15)        = 15
 (b)  mmap2(NULL, 8192, PROT_READ|PROT_WRITE, ...) = 0xb7a2e000
[...]
 (c)  mprotect(0xb7911000, 4096, PROT_NONE)   = 0
      mprotect(0xb7913000, 4096, PROT_NONE)   = 0
[...]
      mprotect(0xb7a2b000, 4096, PROT_NONE)   = 0
      mprotect(0xb7a2d000, 4096, PROT_NONE)   = 0
 (d)  mprotect(0xb7a2f000, 4096, PROT_NONE)   = 0
      mprotect(0xb7a43000, 4096, PROT_NONE)   = 0
      mprotect(0xb7a3d000, 4096, PROT_NONE)   = 0
[...]
      mprotect(0xb7a6b000, 4096, PROT_NONE)   = 0
 (e)  mprotect(0xb7a73000, 4096, PROT_NONE)   = 0
 (f)  mprotect(0xb7a73000, 4096, PROT_READ|PROT_WRITE) = 0
      mprotect(0xb7a6b000, 4096, PROT_READ|PROT_WRITE) = 0
[...]
      mprotect(0xb7a3f000, 4096, PROT_READ|PROT_WRITE) = 0
 (g)  mprotect(0xb7a3d000, 4096, PROT_READ|PROT_WRITE) = 0
      --- SIGSEGV (Segmentation fault) @ 0 (0) ---
      +++ killed by SIGSEGV (core dumped) +++

(a) is a debug print, showing the size of the gc_malloc() call that got the address 0xb7a2e000. The mmap (b) is 8192 bytes in size because I allocate a page for the cell header (for internal GC information) and a separate page for the data (so I can mprotect just the data page and keep the header page read/write); that allocation asked the OS for a fresh new couple of pages (that's why you see a mmap).

From (c) to (e) you can see a sequence of several mprotect calls; those are cells being swept by a collection (protecting the cells against read/write, so that if the mutator tries to touch them, a SIGSEGV is on the way).

From (f) to (g) you can see another sequence of mprotect calls, this time giving the mutator permission to touch those pages again; that's gc_malloc() recycling the recently swept cells.

(d) shows the cell allocated in (a) being swept. Why isn't the address the same (this time it's 0xb7a2f000 instead of 0xb7a2e000)? Because, as you'll remember, the first page is used for the cell header, so the data should be at 0xb7a2e000 + 4096, which is exactly 0xb7a2f000, the start of the memory block that the sweep phase (and gc_free() for that matter) was protecting.

Finally we see the program getting its nice SIGSEGV and dumping a nice little core for touching what it shouldn't.

Then I opened the core with GDB and did something like this [3]:

Program terminated with signal 11, Segmentation fault.
(a)  #0  0x08079a96 in getDispatchFunction ()
     (gdb) print $pc
(b)  $1 = (void (*)()) 0x8079a96 <getDispatchFunction+30>
     (gdb) disassemble $pc
     Dump of assembler code for function
     getDispatchFunction:
     0x08079a78 <getDispatchFunction+0>:  push   %ebp
     0x08079a79 <getDispatchFunction+1>:  mov    %esp,%ebp
     0x08079a7b <getDispatchFunction+3>:  sub    $0x8,%esp
     0x08079a7e <getDispatchFunction+6>:  push   %ebx
     0x08079a7f <getDispatchFunction+7>:  push   %esi
     0x08079a80 <getDispatchFunction+8>:  mov    %eax,-0x4(%ebp)
     0x08079a83 <getDispatchFunction+11>: mov    -0x4(%ebp),%eax
     0x08079a86 <getDispatchFunction+14>: call   0x80bccb4 <objectInvariant>
     0x08079a8b <getDispatchFunction+19>: push   $0xb9
     0x08079a90 <getDispatchFunction+24>: mov    0x8(%ebp),%edx
     0x08079a93 <getDispatchFunction+27>: add    $0xa,%edx
(c)  0x08079a96 <getDispatchFunction+30>: movzwl (%edx),%ecx
     [...]
     (gdb) print /x $edx
(d)  $2 = 0xb7a2f000

First, in (a), GDB tells us where the program received the SIGSEGV. In (b) I print the program counter register to get a more readable hint about where the program segfaulted. It was at getDispatchFunction+30, so I disassemble that function to see that the SIGSEGV was received when executing movzwl (%edx),%ecx (loading into the ECX register the zero-extended 16-bit word at the memory address held in EDX) at (c). In (d) I get the value of the EDX register, and it's 0xb7a2f000. Do you remember that value? It's the data address of the cell at 0xb7a2e000, the one that was recently swept (and mprotected). That's not good for business.

This is the offending method (at dil/src/ast/Visitor.d):

Node function(Visitor, Node) getDispatchFunction()(Node n)
{
    return cast(Node function(Visitor, Node))dispatch_vtable[n.kind];
}

Since I couldn't get any useful information from GDB (I can't even get a proper backtrace [4]) except for the mangled function name (because of the wrong debug information produced by DMD), I had to split that function into smaller functions to confirm that the problem was in n.kind (I guess I could have figured that out by eating some more assembly, but I'm not that well trained at eating asm yet =). This means that the Node instance n is the one prematurely freed.

This is particularly weird, because it looks like the node is being swept, not prematurely freed by an explicit delete. So it seems like the GC is missing some roots (or there are unaligned pointers, or weird stuff like that). The fact that this works fine with the Tango basic collector is intriguing too. One explanation I can come up with for why it works with the basic collector is that it performs far fewer collections than the naive GC (the latter is really lame =). So maybe the rootless object genuinely becomes free before the basic collector has a chance to run a collection, and because of that the problem is never detected.

I've spent over 10 days now investigating this issue (of course this is nowhere near a full-time job for me, so I can only dedicate a couple of days a week to it =), and I still can't find a clear cause for the problem, but I'm a little inclined towards a Dil bug, so I reported one =). So we'll see how this evolves; for now I'll just make gc_free() a NOP to continue my testing...

[1]Please let me know if you have any working, real, Tango-based D application suitable for GC benchmarks (i.e., using the GC and easily scriptable to run it automatically).
[2]all being running Dil without arguments to get the right help message =)
[3]I have shortened the names of the functions because they were huge, cryptic, mangled names =). The real name of getDispatchFunction is _D3dil3ast7Visitor7Visitor25__T19getDispatchFunctionZ19getDispatchFunctionMFC3dil3ast4Node4NodeZPFC3dil3ast7Visitor7VisitorC3dil3ast4Node4NodeZC3dil3ast4Node4Node (it's not much better when demangled: class dil.ast.Node.Node function(class dil.ast.Visitor.Visitor, class dil.ast.Node.Node)* dil.ast.Visitor.Visitor.getDispatchFunction!().getDispatchFunction(class dil.ast.Node.Node) =). The real name of objectInvariant is D9invariant12_d_invariantFC6ObjectZv and has no demangled name that I know of, but I guess it's the Object class invariant.
[4]

Here is what I get from GDB:

(gdb) bt
#0  0x08079a96 in getDispatchFunction ()
#1  0xb78d5000 in ?? ()
#2  0xb789d000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

(function name unmangled and shortened for readability)

Allocations graphs

by Leandro Lucarella on 2009- 08- 26 21:54 (updated on 2009- 08- 26 21:54)
tagged allocation, benchmark, d, dgc, dgcbench, en, gc, graph, naive, statistics - with 0 comment(s)

Here is a set of improved statistics graphs, now including allocation statistics. All the data is plotted together, using the same timeline, to ease analysis and comparison.

Again, all graphs (as the graph title says), are taken using the Naive GC (stats code still not public yet :) and you can find the code for it in my D GC benchmark repository.

This time the (big) graphs are in EPS format because I couldn't render them in PNG as big as I wanted and I didn't have the time to fix that =S

big_arrays rnd_data rnd_data_2 shootout_binarytrees split startup tree

The graphs show the same as in the previous post, with the addition of allocation time (how long it took to perform the allocation) and space (how much memory was requested), which are rendered in the same graph, plus a histogram of cell sizes. The histogram differentiates cells with and without the NO_SCAN bit, which might be useful for seeing how bad the effect of false positives could be.

You can easily see how allocation time peaks match allocations that triggered a collection, for example, and how bad the effect of false positives can be, even when almost all the heap (99.99%) has the NO_SCAN bit (see rnd_data_2).

Graphs

by Leandro Lucarella on 2009- 08- 18 00:26 (updated on 2009- 08- 18 00:26)
tagged benchmark, collection, d, dgc, dgcbench, en, gc, graph, naive, statistics - with 0 comment(s)

It's been exactly 3 months since the last post. I spent the last months writing my thesis document (in Spanish), working, and being unproductive because of the lack of inspiration =)

But in the last couple of days I decided to go back to the code, and finish the statistics gathering in the Naive GC (the new code is not published yet because it needs some polishing). Here are some nice graphs from my little D GC benchmark:

big_arrays rnd_data rnd_data_2 shootout_binarytrees split startup tree

The graphs show the space and time costs of each collection over the program's life. The collection time is divided into the time spent in the malloc() that triggered the collection, the time spent in the collection itself, and the time the world had to be stopped (meaning the time all the threads were paused because of the collection). The space is measured before and after the collection, and the total memory consumed by the program is divided into 4 areas: used space, free space, wasted space (space that the user can't use but that isn't used by the collector either) and overhead (space used by the collector itself).

As you can see, the naive collector pretty much sucks, especially during periods of heavy allocation (since, when a collection fails to free enough room, it just allocates exactly what was asked for in the gc_malloc() call).

The next step is to modify Tango's basic collector to gather the same data and see how things go with it.

Naive GC fixes

by Leandro Lucarella on 2009- 05- 17 19:09 (updated on 2009- 05- 17 19:09)
tagged d, dgc, en, gc, ldc, naive, patch, statistics, tango - with 0 comment(s)

I haven't been posting very often lately because I decided to spend some time writing my thesis document (in Spanish), which was way behind my current status, encouraged by my code-wise bad weekend =P.

Alberto Bertogli was kind enough to review my Naive GC implementation and sent me some patches, improving the documentation (amending my tarzanesque English =) and fixing a couple of (nasty) bugs [1] [2].

I'm starting to go back to the code, now that LDC is very close to a new release and things are starting to settle a little, so I hope I can finish the statistics gathering soon.

Debug is hell

by Leandro Lucarella on 2009- 05- 04 00:24 (updated on 2009- 05- 04 00:24)
tagged d, debug, dgc, dmd, en, gold, ldc, parental advisory, rant, tango - with 8 comment(s)

Warning

Rant ahead.

If Matt Groening had ever written a garbage collector, I'm sure he would have made a book in the Life in Hell series called Debug is Hell.

You can't rely on anything: unit tests are useless (they depend on your code to run), and you can never get a decent backtrace using a debugger (the runtime calls seem to be hidden from the debugger). I don't know if the latter is a compiler issue (I'm using DMD right now because my LDC copy is broken =( ).

Add that to the fact that GNU Gold doesn't work, DMD doesn't work, Tango doesn't work [*] and LDC doesn't work, and that it's already hard to debug in D because most of the mainstream tools (gdb, binutils, valgrind) don't support the language (they can't demangle D symbols, for instance), and you end up with a very hostile environment to work in.

Anyway, it was a very unproductive weekend, my statistics gathering code seems to have some nasty bug and I'm not being able to find it.

PS: I want to apologize in advance to the developers of GNU Gold, DMD, Tango and LDC, because they make great software, much less crappy than mine (well, to be honest I'm not so sure about DMD ;-P); it's just been a bad weekend. Thank you for your hard work, guys =)

[*]Tango trunk is supposed to be broken for Linux

Statistics, benchmark suite and future plans

by Leandro Lucarella on 2009- 05- 01 22:43 (updated on 2009- 05- 01 22:43)
tagged benchmark, d, dgc, en, plan, statistics, todo - with 4 comment(s)

I'm starting to build a benchmark suite for D. My benchmarks and programs request was a total failure (only Leonardo Maffi offered me a small, trivial GC benchmark), so I have to find my own way.

This is a relatively hard task. I went through dsource searching for real D programs (written using Tango; I finally gave up on making Phobos work with LDC because it would be a very time-consuming task) and didn't have much luck either. Most of the stuff there is libraries; the few programs are either not suitable for an automated benchmark suite (like games), abandoned, or only work with Phobos.

I found only 2 candidates:

I have just tried dack for now (I tried MiniD a while ago but got some compilation errors; I have to try again). Web-GMUI seems like a nicely maintained candidate too but, being a client to monitor other BitTorrent clients, it seems a little hard to use in automated benchmarks.

To make better use of the benchmark suite, I'm adding some statistics gathering to my Naive GC implementation, and I will add the same to the Tango basic GC implementation. I will collect this data for each and every collection:

  • Collection time
  • Stop-the-world time (time all the threads were suspended)
  • Current thread suspension time (this is the same as Collection time in both Naive and Tango Basic GC implementations, but it won't be that way in my final implementation)
  • Heap memory used by the program
  • Free heap memory
  • Memory overhead (memory used by the GC not usable by the program)

The last three values will be gathered before and after the collection is made.
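As a sketch of the kind of record this implies, something like the following could hold the data for one collection (all field names are my own invention, not the ones used in the actual stats code, which was not public at the time):

```python
from dataclasses import dataclass

@dataclass
class CollectionStats:
    # Times in seconds, sizes in bytes. Field names are hypothetical.
    collection_time: float
    stop_the_world_time: float
    thread_suspend_time: float
    used_before: int        # heap memory used by the program
    used_after: int
    free_before: int        # free heap memory
    free_after: int
    overhead_before: int    # memory used by the GC, not usable by the program
    overhead_after: int

stats = CollectionStats(0.051, 0.048, 0.051,
                        used_before=1 << 20, used_after=1 << 19,
                        free_before=1 << 19, free_after=1 << 20,
                        overhead_before=160 << 10, overhead_after=160 << 10)
```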

Anyway, if you know any program that can be suitable for use in an automated benchmark suite that uses Tango, please, please let me know.

Naive Garbage Collector

by Leandro Lucarella on 2009- 04- 26 22:49 (updated on 2009- 04- 26 22:49)
tagged d, dgc, en, gc, howto, mark-sweep, naive, tango - with 2 comment(s)

I've been working on a naive garbage collector implementation for D, as a way to document the process of writing a GC for D.

From the Naive Garbage Collector documentation:

The idea behind this implementation is to document all the bookkeeping and considerations that have to be taken into account in order to implement a garbage collector for D.

The garbage collector algorithm itself is extremely simple so the focus can be kept on the specifics of D, and not on the algorithm. A completely naive mark-and-sweep algorithm is used, with a recursive mark phase. The code is extremely inefficient, in order to keep it clean and easy to read and understand.

Performance is, as expected, horrible, horrible, horrible (2 orders of magnitude slower than the basic GC for the simple Tango GC Benchmark), but I think it's pretty good as documentation =)

I have submitted the implementation to Tango in the hope that it gets accepted. A git repository is up too.

If you want to try it out with LDC, you have to put the files into the naive directory in tango/lib/gc, edit the file runtime/CMakeLists.txt and search/replace "basic" with "naive". Then you have to search for the line:

file(GLOB GC_D ${RUNTIME_GC_DIR}/*.d)

And replace it with:

file(GLOB GC_D ${RUNTIME_GC_DIR}/gc/*.d)

Comments and reviews are welcome, and please let me know if you try it =)

Immix mark-region garbage collector

by Leandro Lucarella on 2009- 04- 25 17:39 (updated on 2009- 04- 25 17:39)
tagged copying, d, dgc, en, gc, immix, mark-region, moving, paper, tracing - with 0 comment(s)

Yesterday Fawzi Mohamed pointed me to a Tango forums post (<rant>god! I hate forums</rant> =) where Keith Nazworth announces he wants to start a new GC implementation in his spare time.

He wants to progressively implement the Immix Garbage Collector.

I read the paper and it looks interesting, and it looks like it could use the parallelization I plan to add to the current GC, so maybe our efforts can be coordinated to leave the possibility to integrate both improvements together in the future.

A few words about the paper: the heap organization is pretty similar to the one in the current GC implementation, except that Immix proposes that pages should not be divided into fixed-size bins; instead, variable-sized allocations are done inside a block using pointer bumping. Besides that, all the other optimizations I saw in the paper are fairly general and can be applied to the current GC at some point (though some of them may not fit as well). Among these optimizations are: opportunistic moving to avoid fragmentation, parallel marking, thread-local pools/allocators and generations. Almost all of the optimizations can be implemented incrementally, starting with a very basic collector which is not very far from the current one.

There was some discussion in the newsgroup about adding the necessary hooks to the language to allow a reference-counting-based garbage collector (don't be fooled by the subject! It's not about disabling the GC =) and about implementing weak references. There's a lot of discussion about GC in D lately, which is really exciting!

Guaranteed finalization support

by Leandro Lucarella on 2009- 04- 19 14:03 (updated on 2009- 04- 19 14:03)
tagged d, dgc, en, finalization, specs, understanding the current gc - with 0 comment(s)

There was some discussion going on about what I found in my previous post. Unfortunately the discussion diverged a lot, and lots of people seem to defend non-guaranteed finalization for no reason, or argue that finalization is supposed to be used with RAII.

I find all the arguments very weak, at least for convincing me that the current specs are not broken (if finalizers shouldn't be used with objects whose lifetime is determined by the GC, then don't let that happen).

The current specs allow a D implementation with a GC that doesn't call finalizers for collected objects at all! So any D program relying on that is actually broken.

Anyway, of all the possible solutions to this problem, I think the best is just to provide guaranteed finalization, at least at program exit. That is doable (and easily doable, by the way).

I filed a bug report about this but, unfortunately, seeing how the discussion in the newsgroup went, I'm very skeptical about this being fixed at all.

Object finalization

by Leandro Lucarella on 2009- 04- 18 12:18 (updated on 2009- 04- 18 12:18)
tagged d, dgc, en, finalization, specs, understanding the current gc - with 0 comment(s)

I'm writing a trivial naive (but fully working) GC implementation. The idea is:

  1. Improve my understanding about how a GC is written from the ground up
  2. Ease the learning curve for other people wanting to learn how to write a D GC
  3. Serve as documentation (will be fully documented)
  4. Serve as a benchmarking base (to see how better is an implementation compared to the dumbest and simplest implementation ever =)

There is a lot of literature on GC algorithms, but there is almost none about the particularities of implementing a GC in D (how to handle the stack, how to finalize an object, etc.). The idea of this GC implementation is to tackle that. The collection and allocation algorithms are really simple so you can pay attention to the other stuff.

The exercise is already paying off. Implementing this GC, I was able to see some details I missed when I did the analysis of the current implementation.

For example, I completely missed finalization. The GC stores, for each cell, a flag that indicates whether the object should be finalized, and when the memory is swept it calls rt_finalize() to take care of business. That was easy to add to my toy GC implementation.

Then I was trying to decide whether all memory should be released when the GC is terminated, or whether I could let the OS do that. Then I remembered finalization, so I realized I should at least call the finalizers for the live objects. So I went to see how the current implementation does that.

It turns out it just runs a full collection (you have the option of not collecting at all, or of collecting while excluding roots from the stack, using the undocumented gc_setTermCleanupLevel() and gc_gsetTermCleanupLevel() functions). So if there are still pointers in the static data or on the stack to objects with finalizers, those finalizers are never called.

I've searched the specs and it's a documented feature that D doesn't guarantee that all object finalizers get called:

The garbage collector is not guaranteed to run the destructor for all unreferenced objects. Furthermore, the order in which the garbage collector calls destructors for unreference objects is not specified. This means that when the garbage collector calls a destructor for an object of a class that has members that are references to garbage collected objects, those references may no longer be valid. This means that destructors cannot reference sub objects.

I knew that ordering was not guaranteed, so you can't call another finalizer from a finalizer (and that makes a lot of sense), but I didn't know about the other stuff. This is great for GC implementors but not so nice for GC users ;)

I know that the GC, being conservative, has a lot of limitations, but I think this one is not completely necessary. When the program ends, it should be fairly safe to call all the finalizers for the live objects, referenced or not.

In this scheme, finalization is as reliable as UDP =)

Understanding the current GC, conclusion

by Leandro Lucarella on 2009- 04- 11 16:36 (updated on 2009- 04- 11 16:36)
tagged book, conclusion, d, dgc, druntime, en, gc, mark-sweep, understanding the current gc - with 0 comment(s)

Now that I know the implementation details of the current GC fairly deeply, I can compare it to the techniques exposed in the GC Book.

Tri-colour abstraction

Since most literature speaks in terms of the tri-colour abstraction, now is a good time to show how it maps to the D GC implementation.

As we all remember, each cell (bin) in D has several bits associated with it. Only 3 are interesting in this case:

  • mark
  • scan
  • free (freebits)

So, how can we translate these bits into the tri-colour abstraction?

Black

Cells that were marked and scanned (with no pointers left to follow) are coloured black. In D these cells have the bits:

mark = 1
scan = 0
free = 0
Grey

Cells that have been marked, but still have pointers to follow, are coloured grey. In D these cells have the bits:

mark = 1
scan = 1
free = 0
White

Cells that have not been visited at all are coloured white (all cells should be coloured white before the marking starts). In D these cells have the bits:

mark = 0
scan = X

Or:

free = 1

The scan bit is not important in this case (but in D it should be 0, because scan bits are cleared before the mark phase starts). The free bit is used for the cells in the free list. They are marked, before any other cells get marked, with the bits mark=1 and free=1. This way the cells in the free list don't get scanned (mark=1, scan=0) and are not confused with black cells (free=1), so they can be kept in the free list after the mark phase is done. I think this is only necessary because the free list is regenerated.
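The whole mapping can be condensed into a tiny function; this is just a restatement of the table above, not code from the implementation:

```python
def colour(mark, scan, free):
    """Translate the D GC bits into the tri-colour abstraction."""
    if free or not mark:
        return "white"  # in the free list, or not visited yet
    if scan:
        return "grey"   # marked, but with pointers still to follow
    return "black"      # marked and fully scanned

# Free-list cells get mark=1, free=1, so they are never scanned
# (not grey) and never confused with black cells (free=1).
```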

Improvements

Here is a summary of the improvements proposed by the GC Book, how the current GC is implemented with regard to these improvements, and what optimization opportunities can be considered.

Mark stack

The simplest version of the marking algorithm is recursive:

mark(cell)
    if not cell.marked
        cell.marked = true
        for child in cell.children
            mark(child)

The problem here is, of course, stack overflow for very deep heap graphs (and the space cost).

The book proposes using a marking stack instead, and several ways to handle stack overflow, but all these only relieve the symptom; they are not a cure.

As a real cure, pointer reversal is proposed. The idea is to use the very same pointers to store the mark stack. This is constant in space and needs only one pass through the heap, so it's a very tempting approach. The downside is increased complexity and probably worse cache behavior (the writes dirty the entire heap, and this can kill the cache).
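For reference, the mark-stack variant the book proposes replaces the recursion with an explicit stack; a minimal sketch (not the D implementation, which uses bitmaps instead of per-cell fields):

```python
class Cell:
    def __init__(self, children=()):
        self.marked = False
        self.children = list(children)

def mark(root):
    """Iterative marking: an explicit stack replaces the call stack, so
    deep heap graphs can't overflow it (the stack lives on the heap)."""
    stack = [root]
    while stack:
        cell = stack.pop()
        if not cell.marked:
            cell.marked = True
            stack.extend(cell.children)  # push the grey cell's children
```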

Current implementation

The D GC implementation does none of this. Instead it completes the mark phase by traversing the heap (well, not really the heap, only the bit sets) in several passes, until no more data to scan can be found (all cells are painted black or white). While the original algorithm needs only one pass through the heap, this one needs several. It trades space (and the complexity of stack overflow handling) for time.

Optimization opportunities

This seems like a fair trade-off, but alternatives can be explored.

Bitmap marking

The simplest mark-sweep algorithm stores the mark bits in the cells themselves. This can be very bad for the cache, because a full traversal has to be done across the entire heap. As an optimization, a bitmap can be used instead: bitmaps are much smaller and much more likely to fit in the cache, so marking can be greatly improved by using them.

Current implementation

The current implementation uses bitmaps for the mark, scan, free and other bits. The bitmap implementation is GCBits and is a general approach.

The bitmap stores one bit for each 16-byte chunk, no matter what cell size (Bins, or bin size) is used. This means that 4096/16 = 256 bits (32 bytes) are used for each bitmap for every page in the GC heap. With 5 bitmaps (mark, scan, freebits, finals and noscan), the total space per page is 160 bytes. That's a 4% space overhead in bits alone.

This wastes some space for larger cells.

Optimization opportunities

The space overhead of the bitmaps seems fairly small, but every byte counts in the mark phase because of the cache. A heap of 64 MiB uses 2.5 MiB in bitmaps. Modern processors come with about that much cache, and a program using 64 MiB doesn't seem very rare. So we are pushing the limits here if we want our bitmaps to fit in the cache to speed up the marking phase.
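The arithmetic behind these numbers can be checked quickly (one bit per 16-byte chunk, 5 bitmaps):

```python
CHUNK = 16      # each bit covers a 16-byte chunk
NBITMAPS = 5    # mark, scan, freebits, finals, noscan

def bitmap_bytes(heap_bytes):
    """Total space used by all the bitmaps for a heap of the given size."""
    bits_per_bitmap = heap_bytes // CHUNK
    return bits_per_bitmap // 8 * NBITMAPS

print(bitmap_bytes(4096))          # 160 bytes per 4 KiB page (~4%)
print(bitmap_bytes(64 * 1024**2))  # 2621440 bytes = 2.5 MiB for a 64 MiB heap
```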

I think there is a little room for improvement here. A big object, let's say 8 MiB long, uses 320 KiB of memory for bitmaps it doesn't need. Specialized bitmaps could be used for large objects, for instance, to minimize the bitmap space overhead.

There are some overlapping bits too; mark=0 and scan=1 can never happen, for instance. I think it should be possible to use that combination for freebits and get rid of an entire bitmap.

Lazy sweep

The sweep phase is generally done right after the mark phase. Since the collection is normally triggered by an allocation, this can be rather disruptive for the thread that made that allocation, which has to absorb all the sweeping work itself.

An alternative is to do the sweeping incrementally, i.e. lazily. Instead of finding all the white cells and linking them to the free list immediately, this is done on each allocation: if there are no free cells in the free list, a little sweeping is done until new space is found.

This can help minimize pauses for the allocating thread.
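A sketch of what the lazy variant's allocation path could look like (hypothetical helpers; cells are just dicts with a marked flag here):

```python
def lazy_alloc(free_list, unswept_pages):
    """Lazy sweeping sketch: when the free list is empty, sweep just
    enough unswept pages to satisfy this one allocation, instead of
    sweeping the whole heap right after the mark phase."""
    while not free_list and unswept_pages:
        page = unswept_pages.pop()
        # keep only the unmarked (white) cells of this page
        free_list.extend(c for c in page if not c["marked"])
    return free_list.pop() if free_list else None  # None: time to collect
```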

Current implementation

The current implementation does eager sweeping.

Optimization opportunities

The sweep phase can be made lazy. The only disadvantage I see (well, besides the extra complexity) is that it could make the heap more likely to become fragmented, because consecutive requests are not necessarily served from the same page (a free() call can add new cells from another page to the free list), making the heap more sparse (which can be bad for the cache too). But I think this is only possible if free() is called explicitly, and that should be fairly rare in a garbage-collected system, so I guess this could be worth trying.

Lazy sweeping helps the cache too: in the sweep phase, you can trigger cache misses when linking cells to the free list. When sweeping lazily, the cache miss is delayed until it's really necessary (it will happen anyway when you allocate the free cell).

Conclusion

Even though the current GC is fairly optimized, there is plenty of room for improvement, even while preserving the original global design.

Understanding the current GC, the end

by Leandro Lucarella on 2009- 04- 11 01:46 (updated on 2009- 04- 15 01:10)
tagged d, dgc, druntime, en, gc, mark, mark-sweep, sweep, understanding the current gc - with 0 comment(s)

In this post I will take a closer look at the Gcx.mark() and Gcx.fullcollect() functions.

This is a simplified version of the mark algorithm:

mark(from, to)
    changes = 0
    while from < to
        pool = findPool(from)
        offset = from - pool.baseAddr
        page_index = offset / PAGESIZE
        bin_size = pool.pagetable[page_index]
        bit_index = find_bit_index(bin_size, pool, offset)
        if not pool.mark.test(bit_index)
            pool.mark.set(bit_index)
            if not pool.noscan.test(bit_index)
                pool.scan.set(bit_index)
                changes = true
        from++
        anychanges |= changes // anychanges is global

In the original version there are some optimizations, and the find_bit_index() function doesn't exist (some bit masking is done in place to find the right bit index for the bit set). But everything else is pretty much the same.

So far, it's evident that the algorithm doesn't mark the whole heap in one step, because it doesn't follow pointers. It just marks a consecutive chunk of memory, assuming that pointers can be at any place in that memory, as long as they are aligned (from is incremented in word-sized steps).

fullcollect() is the one in charge of following pointers and marking chunks of memory. It does this in an iterative way (that's why mark() reports anychanges: when new pointers should be followed to mark them or, speaking in the tri-colour abstraction, when grey cells are found).

fullcollect() is huge, so I'll split it up into smaller pieces for the sake of clarity. Let's see what the basic blocks are (see the second part of this series):

fullcollect()
    thread_suspendAll()
    clear_mark_bits()
    mark_free_list()
    rt_scanStaticData(mark)
    thread_scanAll(mark, stackTop)
    mark_root_set()
    mark_heap()
    thread_resumeAll()
    sweep()

Generally speaking, the functions with some CamelCasing are real functions, and the ones that are all_lowercase are made up by me.

Let's see each function.

thread_suspendAll()
This is part of the threads runtime (found in src/common/core/thread.d). A simple peek at it shows that it uses SIGUSR1 to stop a thread. When the signal is caught, all the registers are pushed onto the stack to be sure any pointers there are scanned in the future. The thread then waits for SIGUSR2 to resume.
clear_mark_bits()
foreach pool in pooltable
    pool.mark.zero()
    pool.scan.zero()
    pool.freebits.zero()
mark_free_list()
foreach n in B_16 .. B_PAGE
    foreach node in bucket
        pool = findPool(node)
        pool.freebits.set(find_bit_index(pool, node))
        pool.mark.set(find_bit_index(pool, node))
rt_scanStaticData(mark)
This function, as the name suggests, uses the provided mark function callback to scan the program's static data.
thread_scanAll(mark, stackTop)
This is another threads runtime function, used to mark the suspended threads' stacks. It does some calculation of the stack bottom and top, and calls mark(bottom, top), so at this point we have marked all the memory reachable from the stack(s).
mark_root_set()
mark(roots, roots + nroots)
foreach range in ranges
    mark(range.pbot, range.ptop)
mark_heap()

This is where most of the marking work is done. The code is really ugly and very hard to read (mainly because of bad variable names), but what it does is relatively simple. Here is the simplified algorithm:

// anychanges is global and was set by the mark()ing of the
// stacks and root set
while anychanges
    anychanges = 0
    foreach pool in pooltable
        foreach bit_pos in pool.scan
            if not pool.scan.test(bit_pos)
                continue
            pool.scan.clear(bit_pos) // mark as already scanned
            bin_size = find_bin_for_bit(pool, bit_pos)
            bin_base_addr = find_base_addr_for_bit(pool, bit_pos)
            if bin_size < B_PAGE // small object
                bin_top_addr = bin_base_addr + bin_size
            else if bin_size in [B_PAGE, B_PAGEPLUS] // big object
                page_num = (bin_base_addr - pool.baseAddr) / PAGESIZE
                if bin == B_PAGEPLUS // search for the base page
                    while pool.pagetable[page_num - 1] != B_PAGE
                        page_num--
                n_pages = 1
                while page_num + n_pages < pool.ncommitted
                        and pool.pagetable[page_num + n_pages] == B_PAGEPLUS
                    n_pages++
                bin_top_addr = bin_base_addr + n_pages * PAGESIZE
            mark(bin_base_addr, bin_top_addr)

The original algorithm has some optimizations for processing bits in clusters (it skips groups of bins without the scan bit) and some kind-of bugs too.

Again, the functions in all_lower_case don't really exist; some pointer arithmetic is done in place to find those values.

Note that the pools are iterated over and over again until there are no unvisited bins. I guess this is a fair price to pay for not having a mark stack (but I'm not really sure =).

thread_resumeAll()
This is, again, part of the threads runtime; it resumes all the paused threads by sending them a SIGUSR2.
sweep()
mark_unmarked_free()
rebuild_free_list()
mark_unmarked_free()

This (invented) function looks for unmarked bins and sets the freebits bit on them if they are small objects (bin size smaller than B_PAGE), or marks the entire page as free (B_FREE) in the case of large objects.

This step is in charge of executing destructors too (through rt_finalize(), a runtime function).

rebuild_free_list()

This (also invented) function first clears the free list (bucket) and then rebuilds it using the information collected in the previous step.

As usual, only bins with a size smaller than B_PAGE are linked to the free list, except when all the bins in a page have been freed, in which case the page is marked with the special B_FREE bin size. The same goes for big objects freed in the previous step.

I think rebuilding the whole free list is not necessary; the newly freed bins could just be linked to the existing free list. I guess this step exists to help reduce fragmentation, since the rebuilt free list groups bins belonging to the same page together.

Understanding the current GC, part IV

by Leandro Lucarella on 2009- 04- 10 18:33 (updated on 2009- 04- 10 18:33)
tagged d, dgc, druntime, en, freeing, gc, mark-sweep, reallocation, understanding the current gc - with 0 comment(s)

What about freeing? Well, it's much simpler than allocation =)

GC.free(ptr) is a thread-safe wrapper for GC.freeNoSync(ptr).

GC.freeNoSync(ptr) gets the Pool that ptr belongs to and clears its bits. Then, if ptr points to a small object (bin size smaller than B_PAGE), it simply links that bin to the free list (Gcx.bucket). If ptr is a large object, the number of pages used by the object is calculated, and then all those pages are marked as B_FREE (done by Pool.freePages(start, n_pages)).

Then there is reallocation, which is a little more twisted than freeing, but doesn't add much value to the analysis. It does what you'd think it should (except maybe for a possible bug), using functions already seen in this post or in the previous ones.

Understanding the current GC, part III

by Leandro Lucarella on 2009- 04- 10 02:28 (updated on 2009- 04- 10 02:28)
tagged allocation, d, dgc, druntime, en, gc, mark-sweep, understanding the current gc - with 0 comment(s)

In the previous post we focused on the Gcx object, the core of the GC in druntime (and Phobos and Tango; they are all based on the same implementation). In this post we will focus on allocation, which is a little more complex than it should be, in my opinion.

It was not an easy task to follow how allocation works. A GC.malloc() call expands into this chain of function calls:

GC.malloc(size, bits)
 |
 '---> GC.mallocNoSync(size, bits)
        |
        |---> Gcx.allocPage(bin_size)
        |      |
        |      '---> Pool.allocPages(n_pages)
        |             |
        |             '---> Pool.extendPages(n_pages)
        |                    |
        |                    '---> os_mem_commit(addr, offset, size)
        |
        |---> Gcx.fullcollectshell()
        |
        |---> Gcx.newPool(n_pages)
        |      |
        |      '---> Pool.initialize(n_pages)
        |             |
        |             |---> os_mem_map(mem_size)
        |             |
        |             '---> GCBits.alloc(bits_size)
        |
        '---> Gcx.bigAlloc(size)
               |
               |---> Pool.allocPages(n_pages)
               |      '---> (...)
               |
               |---> Gcx.fullcollectshell()
               |
               |---> Gcx.minimize()
               |      |
               |      '---> Pool.Dtor()
               |             |
               |             |---> os_mem_decommit(addr, offset, size)
               |             |
               |             |---> os_mem_map(addr, size)
               |             |
               |             '---> GCBits.Dtor()
               |
               '---> Gcx.newPool(n_pages)
                      '---> (...)

Doesn't look so simple, huh?

The map/commit distinction of Windows doesn't exactly help simplicity. Note that Pool.initialize() maps the memory (reserves the address space) while Pool.allocPages() (through Pool.extendPages()) commits the new memory (asks the OS to actually back the virtual memory). I don't know how good this is for Windows (or, put another way, how bad it would be for Windows if all mapped memory were immediately committed), but it adds a new layer of complexity (one that's not even needed in POSIX OSs). The whole branch starting at Gcx.allocPage(bin_size) would be gone if this distinction weren't made. Besides that, it worsens performance on POSIX OSs, because there are some non-trivial lookups to handle these non-existent non-committed pages; even though the os_mem_commit() and os_mem_decommit() functions are NOPs and can be optimized out, the lookups are still there.

Mental Note

See if getting rid of the commit()/decommit() stuff improves Linux performance.

But well, let's forget about this issue for now and live with it. Here is a summary of what all these functions do.

Note

I recommend giving the (updated) previous posts of this series another read, especially if you are not familiar with the Pool concept and implementation.

GC.malloc(size, bits)
This is just a wrapper for multi-threaded code: it takes the GCLock if necessary and calls GC.mallocNoSync(size, bits).
GC.mallocNoSync(size, bits)

This function has one algorithm for small objects (less than a 4 KiB page) and another for big objects.

It does some work common to both cases, like logging and adding a sentinel for debugging purposes (if those features are enabled), finding the bin size (bin_size) that best fits size (caching the result as an optimization for consecutive calls to malloc with the same size) and setting the bits (NO_SCAN, NO_MOVE, FINALIZE) of the allocated bin.

Small objects (bin_size < B_PAGE)
Looks at the free list (Gcx.bucket) trying to find a page with the minimum bin size that's equal to or bigger than size. If that fails, it calls Gcx.allocPage(bin_size) to find room in uncommitted pages. If there is still no room for the requested amount of memory, it triggers a collection (Gcx.fullcollectshell()). If there is still no luck, Gcx.newPool(1) is called to ask the OS for more memory. Then Gcx.allocPage(bin_size) is called again (remember the new memory is just mmap'ed but not committed) and if there is still no room in the free list, an out of memory error is issued.
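The small-object fallback chain described above can be sketched like this. This is rough Python, not the actual D code; the Gcx stub and all its method names are invented just to make the control flow runnable.

```python
# Toy model of the small-object allocation fallback chain in
# GC.mallocNoSync(). All names here are illustrative, not the real API.
class Gcx:
    def __init__(self):
        self.bucket = {}        # free list: bin_size -> list of free bins
        self.collected = False

    def pop_free(self, bin_size):
        lst = self.bucket.get(bin_size, [])
        return lst.pop() if lst else None

    def alloc_page(self, bin_size):
        return False            # pretend no committable page is left

    def fullcollect(self):
        # pretend the collection freed one bin of a few sizes
        self.collected = True
        for size in (16, 32, 64):
            self.bucket.setdefault(size, []).append(object())

    def new_pool(self, n_pages):
        return True

def small_alloc(gcx, bin_size):
    p = gcx.pop_free(bin_size)                     # 1. try the free list
    if p is None and gcx.alloc_page(bin_size):     # 2. commit a mapped page
        p = gcx.pop_free(bin_size)
    if p is None:
        gcx.fullcollect()                          # 3. trigger a collection
        p = gcx.pop_free(bin_size)
    if p is None and gcx.new_pool(1) and gcx.alloc_page(bin_size):
        p = gcx.pop_free(bin_size)                 # 4. ask the OS for more
    if p is None:
        raise MemoryError("out of memory")
    return p
```

Note how a collection is attempted before going to the OS for more memory: the collector prefers recycling over growing the heap.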
Big objects (B_PAGE and B_PAGEPLUS)
It simply calls Gcx.bigAlloc(size) and issues an out of memory error if that call fails to get the requested memory.
Gcx.allocPage(bin_size)
This function linearly searches the pooltable for a Pool with an allocable page (i.e. a page already mapped but not yet committed). This is done through a call to Pool.allocPages(1). If a page is found, its bin size is set to bin_size via the Pool's pagetable, and all the bins of that page are linked to the free list (Gcx.bucket).
Pool.allocPages(n_pages)
Searches for n_pages consecutive free pages (B_FREE) among the committed pages (pages in the pagetable with an index lower than ncommitted). If they're not found, Pool.extendPages(n_pages) is called to commit some more mapped pages to fulfill the request.
Pool.extendPages(n_pages)
Commits n_pages already mapped pages (calling os_mem_commit()), setting them as free (B_FREE) and updating the ncommitted attribute. If there aren't that many uncommitted pages left, it returns an error.
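The allocPages()/extendPages() pair can be modelled as a search for a run of free pages among the committed ones, with a fallback that commits more mapped pages. A rough Python sketch (the real code is D, and details like commit granularity are omitted):

```python
# Toy model of Pool.allocPages() falling back to Pool.extendPages().
# The pagetable entries and return convention are illustrative.
B_FREE, B_UNCOMMITTED = "free", "uncommitted"

def alloc_pages(pagetable, ncommitted, n_pages):
    """Return (start_index, new_ncommitted), or (None, ncommitted) on failure."""
    run = 0
    for i in range(ncommitted):                    # only look at committed pages
        run = run + 1 if pagetable[i] == B_FREE else 0
        if run == n_pages:
            return i - n_pages + 1, ncommitted
    # No run found: commit n_pages more mapped pages (extendPages()).
    if ncommitted + n_pages <= len(pagetable):
        for i in range(ncommitted, ncommitted + n_pages):
            pagetable[i] = B_FREE                  # os_mem_commit() would go here
        return ncommitted, ncommitted + n_pages
    return None, ncommitted                        # not enough mapped pages left
```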
Gcx.newPool(n_pages)
This function adds a new Pool to the pooltable. It first adjusts the n_pages variable using various rules (for example, it doubles the currently allocated memory until 8 MiB are allocated, and from then on always allocates 8 MiB pools, unless more memory was requested in the first place, of course). Then a new Pool is created with the adjusted n_pages value and initialized by calling Pool.initialize(n_pages), the pooltable is resized to fit the new number of pools (npools) and sorted using Pool.opCmp() (which compares baseAddr). Finally the minAddr and maxAddr attributes are updated.
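The growth rule described above (double the heap until the 8 MiB pool cap is reached, unless the request itself is bigger) could be sketched like this. The exact rules in the D code differ in details; this only captures the growth idea, and the function name is invented:

```python
# Toy version of the n_pages adjustment done by Gcx.newPool().
PAGESIZE = 4096
POOLSIZE_CAP = 8 * 1024 * 1024 // PAGESIZE   # 8 MiB expressed in pages (2048)

def adjust_npages(n_pages, pages_allocated_so_far):
    # Grow by roughly the amount already allocated (i.e. double the heap),
    # capped at an 8 MiB pool, but never below what was actually requested.
    grown = min(pages_allocated_so_far, POOLSIZE_CAP)
    return max(n_pages, grown)
```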
Pool.initialize(n_pages)
Initializes all the Pool attributes, mapping the requested number of pages (n_pages) using os_mem_map(). All the bit sets (mark, scan, freebits, noscan) are allocated (using GCBits.alloc()) with n_pages * PAGESIZE / 16 bits, and so is the pagetable, setting all bins to B_UNCOMMITTED and ncommitted to 0.
Gcx.bigAlloc(size)

This is the weirdest function by far. There are some very strange things in it, but I'll try to explain what I understand of it (what I think it's trying to do).

It first makes a simple lookup in the pooltable for n_pages consecutive pages in any existing Pool (calling Pool.allocPages(n_pages) as in Gcx.allocPage()). If this fails, it runs fullcollectshell() (if not disabled), then calls minimize() (to prevent bloat) and then creates a new pool (calling newPool() followed by Pool.allocPages()). If all that fails, it returns an error. If something succeeds, the bin size for the first page is set to B_PAGE and the remaining pages (if any) are set to B_PAGEPLUS. If there is any unused memory at the end, it's initialized to 0 (to prevent false positives when scanning, I guess).
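Condensed into Python, the bigAlloc() fallback sequence looks roughly like this. Each callable argument stands for one of the calls named above; the names and the callback structure are invented for the sketch:

```python
# Toy model of the Gcx.bigAlloc() fallback sequence.
def big_alloc(try_pools, collect, minimize, new_pool, disabled=False):
    p = try_pools()             # lookup in existing pools (Pool.allocPages)
    if p is None and not disabled:
        collect()               # fullcollectshell()
        minimize()              # return free pools to the OS
        p = try_pools()
    if p is None:
        new_pool()              # newPool() + allocPages() on the new pool
        p = try_pools()
    return p                    # None means out of memory
```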

The weird thing about this is that a lot of lookups into the pooltable are done under certain conditions, but I think they are not needed because nothing has changed that could make new room.

I don't know if this is legacy code that never got updated and has a lot of useless lookups, or if I'm getting something wrong. Help is welcome!

There is not much to say about the os_mem_xxx(), Gcx.minimize() and Gcx.fullcollectshell() functions; they were briefly described in the previous posts of this series. Pool.Dtor() just undoes what was done in Pool.initialize().

A final word about the free list (Gcx.bucket). It's just a simple linked list. It uses the first size_t bytes of a free bin to point to the next free bin (there's always room for a pointer in a bin because the minimum bin size is 16 bytes). A simple structure is used to ease this:

struct List {
    List *next;
}

Then, the memory cell is cast to this structure to use the next pointer, like this:

p = gcx.bucket[bin]
gcx.bucket[bin] = (cast(List*) p).next

I really have my doubts about whether this is even a little less cryptic than:

p = gcx.bucket[bin]
gcx.bucket[bin] = *(cast(void**) p)

But what the hell, this is not really important =)

Understanding the current GC, part II

by Leandro Lucarella on 2009-04-05 21:00 (updated on 2009-04-15 01:10)
tagged d, dgc, druntime, en, gc, gcx, mark-sweep, understanding the current gc - with 0 comment(s)

Back to the analysis of the current GC implementation, in this post I will focus on the Gcx object structure and methods.

Gcx attributes

Root set
roots (nroots, rootdim)
An array of root pointers.
ranges (nranges, rangedim)
An array of root ranges (a range of memory that should be scanned for root pointers).
Beginning of the stack (stackBottom)
A pointer to the stack bottom (assuming it grows up).
Pool table (pooltable, npools)
An array of pointers to Pool objects (the heap itself).
Free list (bucket)
A free list for each Bins size.
Internal state
anychanges
Set if the marking of a range has actually marked anything (used later by the full collection).
inited
Set if the GC has been initialized.
Behaviour changing attributes
noStack
Don't scan the stack if activated.
log
Turn on logging if activated.
disabled
Don't run the collector if activated.
Cache (for optimizations and such)
p_cache, size_cache
Querying the size of a heap object is an expensive task. This caches the last query as an optimization.
minAddr, maxAddr
All the heap is in this range. It's used as an optimization when looking if a pointer can be pointing into the heap (if the pointer is not in this range it can be safely discarded, but if it's in the range, a full search in the pooltable should be done).
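The minAddr/maxAddr optimization described above amounts to a cheap range check before the full pooltable search. A rough Python sketch, with the pooltable modelled as a list of (baseAddr, topAddr) pairs (invented representation):

```python
# Toy model of the minAddr/maxAddr fast rejection before a pool lookup.
def find_pool(pooltable, p, min_addr, max_addr):
    if not (min_addr <= p < max_addr):
        return None                      # can't possibly point into the heap
    for base, top in pooltable:          # full search only inside the range
        if base <= p < top:
            return (base, top)
    return None                          # in the range, but between pools
```

Since most false pointer candidates found while scanning fall outside the heap entirely, this single comparison pair avoids the pooltable walk in the common case.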

Gcx main methods

initialize()
Initialization: sets the Gcx object attributes to 0, except for stackBottom (which is set to the address of a dummy local variable; this works because this function is one of the first functions called by the runtime) and the inited flag, which is set to 1. The log is initialized too.
Dtor()
Destruction, free all the memory.
Root set manipulation
addRoot(p), removeRoot(p), rootIter(dg)
Add, remove and iterate over single root pointers.
addRange(pbot, ptop), removeRange(pbot), rangeIter(dg)
Add, remove and iterate over root pointer ranges. These methods are almost the same as the previous ones, so the code duplication here could be improved.
Flags manipulation

Each bin has some flags associated with it (as explained before). With these functions the user can manipulate some of them:

  • FINALIZE: this pool has destructors to be called (final flag)
  • NO_SCAN: this pool should not be scanned for pointers (noscan flag)
  • NO_MOVE: this pool shouldn't be moved (not implemented)
getBits(pool, biti)
Get which of the flags are set for the bin biti of the pool pool.
setBits(pool, mask)
Set the flags specified by mask for the pool pool.
clrBits(pool, mask)
Clear the flags specified by mask for the pool pool.
Searching
findPool(p)
Find the Pool object that pointer p is in.
findBase(p)
Find the base address of the block containing pointer p.
findSize(p)
Find the size of the block pointed by p.
getInfo(p)
Get information about the pointer p. The information is composed of: base (the base address of the block), size (the size of the block) and attr (the flags associated with the block, as shown in Flags manipulation). This information is returned in a structure called BlkInfo.
findBin(size)
Compute Bins (bin size) for an object of size size.
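Given the bin sizes described earlier (powers of 2 from 16 up to the 4 KiB page), findBin() essentially rounds the request up to the next power of two. A hedged Python sketch (the real findBin() uses lookup tables and returns an enum value, not the byte size):

```python
# Toy version of the bin-size computation: round size up to the next
# power of two, with a minimum bin of 16 bytes. Anything bigger than a
# page is not a bin at all (it's handled by bigAlloc()).
def find_bin(size, min_bin=16, page=4096):
    b = min_bin
    while b < size:
        b *= 2
    return b if b <= page else None
```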
Heap (pagetable) manipulation

The pooltable is always kept sorted.

reserve(size)
Allocate a new Pool of at least size bytes.
minimize()
Minimizes physical memory usage by returning free pools to the OS.
bigAlloc(size)
Allocate a chunk of memory that is larger than a page.
newPool(npages)
Allocate a new Pool with at least npages pages in it.
allocPage(bin)
Allocate a page of bin size.
Collection
mark(pbot, ptop)

This is the mark phase. It searches a range of memory for values that look like pointers and marks any that point into the GC heap. The mark bit is set, and if the noscan bit is unset, the scan bit is set too (indicating that the block should be scanned for pointers, equivalent to coloring the cell grey in the tri-colour abstraction).

The mark phase is not recursive (and no mark stack is used). Only the passed range is marked; pointers are not followed here.

That's why the anychanges flag is used: if anything got marked, anychanges is set to true. The marking phase is repeated until no more blocks get marked, at which point we can safely assume that all the live blocks are marked.
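The iterate-until-no-changes idea can be sketched in Python, with the heap modelled as a dict mapping each object to the objects it references (an invented representation; the real code works on memory ranges and bit sets):

```python
# Toy model of the iterative, non-recursive marking driven by the
# anychanges flag: repeat passes until a pass marks nothing new.
def mark_all(heap, roots):
    marked = set()
    to_scan = set(roots)        # "grey" objects: marked-pending-scan
    anychanges = True
    while anychanges:           # one pass per iteration
        anychanges = False
        for obj in list(to_scan):
            to_scan.discard(obj)
            if obj not in marked:
                marked.add(obj)
                anychanges = True
                to_scan.update(heap.get(obj, []))   # grey its children
    return marked
```

This trades the stack space of a recursive marker for repeated passes, which is exactly the trade-off the anychanges flag encodes.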

fullcollectshell()
The purpose of the shell is to ensure all the registers get put on the stack so they'll be scanned.
fullcollect(stackTop)

Collect memory that is not referenced by the program. The algorithm is something like this:

  1. Stop the world (all other threads)
  2. Clear all the mark/scan bits in the pools
  3. Manually mark each free list entry (bucket), so it doesn't get scanned
  4. mark() the static data
  5. mark() stacks and registers for each paused thread
  6. mark() the root set (both roots and ranges)
  7. mark() the heap iteratively until no more changes are detected (anychanges is false)
  8. Start the world (all other threads)
  9. Sweep (free up everything not marked)
  10. Free complete pages, rebuild free list
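The steps above, condensed into a toy stop-the-world mark-sweep in Python (heap: name -> list of referenced names; this invented model leaves out steps 1, 5 and 8, the thread handling):

```python
# Toy mark-sweep following the algorithm outline above.
def collect(heap, roots):
    marked, pending = set(), list(roots)       # steps 2-3: nothing marked yet
    while pending:                             # steps 4-7: mark transitively
        obj = pending.pop()
        if obj in heap and obj not in marked:
            marked.add(obj)
            pending.extend(heap[obj])
    for obj in [o for o in heap if o not in marked]:
        del heap[obj]                          # steps 9-10: sweep the garbage
    return heap
```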

Note

This is a very summarized version of the algorithm, what I could understand from a quick look into the code, which is pretty much undocumented. A deeper analysis should be done in a following post.

TODO list

by Leandro Lucarella on 2009-04-05 03:42 (updated on 2009-04-05 03:42)
tagged d, dgc, en, issue tracker, plan, project, todo - with 0 comment(s)

I've activated the issue tracker module in my D Garbage Collector Research project to be able to track my TODO list.

This is probably useful just for me, but maybe you can be interested in knowing what I will do next =)

GC optimization for contiguous pointers to the same page

by Leandro Lucarella on 2009-04-01 20:41 (updated on 2009-04-01 20:41)
tagged d, dgc, en, gc, optimization, phobos - with 0 comment(s)

This optimization had a patch, written by Vladimir Panteleev, sitting in Bugzilla (issue #1923) for a little more than a year now. It was already included in both Tango (issue #982) and DMD 2.x, but DMD 1.x was missing it.

Fortunately it's now included in DMD 1.042, released yesterday.

This optimization is best seen when you do word splitting of a big text (as shown in the post that triggered the patch):

import std.file, std.string;
void main() {
    auto txt = cast(string) read("text.txt"); // 6.3 MiB of text
    auto words = txt.split();
}

Now in words we have an array of slices (a contiguous area in memory filled with pointers) about the same size as the original text, as explained by Vladimir.

The GC heap is divided into (4 KiB) pages, and each page contains cells of a fixed size called bins. The bin sizes go from 16 (B_16) to 4096 (B_PAGE) in powers of 2 (32, 64, etc.). See Understanding the current GC for more details.

For large contiguous objects (like txt in this case) multiple pages are needed, and those pages each contain a single bin of size B_PAGEPLUS, indicating that the object is distributed among several pages.

Now, back to the words array: we have about 3 million interior pointers into the contiguous memory of txt (stored in about 1600 pages of bins of size B_PAGEPLUS). So each time the GC needs to mark the heap, it has to follow these 3 million pointers and find out where the beginning of each pointed-to block is to check its mark state (whether it's marked or not). Finding the beginning of the block is not that slow, but multiplied by 3 million it gets a little noticeable. Especially since this is done several times as the dynamic array of words grows and the GC collection is triggered again and again, so the cost compounds.

The optimization consists of remembering the last visited page if its bin size was B_PAGE or B_PAGEPLUS, so if the current pointer being followed points into the last visited (cached) page, we can skip the lookup (and indeed all the marking, as we know we already visited that page).
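The caching idea can be sketched in a few lines of Python. This is an invented model, not the patch itself; lookup_block_base stands for the expensive find-the-beginning-of-the-block step, and for simplicity every pointer here is assumed to land in a B_PAGE/B_PAGEPLUS page (the only case the cache applies to):

```python
# Toy model of the cached-page optimization for contiguous pointers.
PAGESIZE = 4096

def mark_pointers(pointers, lookup_block_base):
    cached_page = None
    lookups = 0
    for p in pointers:
        page = p // PAGESIZE
        if page == cached_page:
            continue                 # same page as last time: skip everything
        lookups += 1
        lookup_block_base(p)         # the expensive lookup being avoided
        cached_page = page
    return lookups
```

For the word-splitting case, where millions of consecutive slices point into the same few pages, almost every lookup is skipped.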

Mercurial is not good enough

by Leandro Lucarella on 2009-03-31 23:55 (updated on 2009-03-31 23:55)
tagged d, dgc, en, fast-export, git, howto, ldc, mercurial - with 0 comment(s)

I started learning some Mercurial for interacting with the LDC repository, but I disliked it instantly. Sure, it's great when you come from SVN, but it's just too limited if you come from GIT (I can't live anymore without git rebase -i).

Fortunately there is fast-export. With it I can incrementally import the Mercurial repository in a GIT repository as easy as:

hg clone http://hg.dsource.org/projects/ldc ldc-hg
mkdir ldc
cd ldc
git init
hg-fast-export.sh -r my_local_hg_repo_clone

I'm very happy to be at home again =)

LDC

by Leandro Lucarella on 2009-03-29 15:56 (updated on 2009-03-29 15:56)
tagged compiler, d, dgc, en, howto, ldc, llvm - with 0 comment(s)

My original plan was to use GDC as my compiler of choice. This was mainly because DMD is not free and there is a chance that I need to put my hands in the compiler guts.

That was one or two years ago; the situation has changed a lot since. GDC is dead (there has been no activity for a long time, and that, added to the fact that GCC hacking is hard, pretty much removes GDC from the scene for me).

OTOH, DMD now provides the full source code of the back-end (the front-end was released under the GPL/Artistic licence long ago), but the license is really unclear about what you can do with it. The license mostly tells you how you can never, never, never sue Digital Mars, but it says almost nothing about what you can actually do:

The Software is copyrighted and comes with a single user license, and may
not be redistributed. If you wish to obtain a redistribution license,
please contact Digital Mars.

You can't redistribute it, that's for sure. It says nothing about modifications. Anyway, I don't think Walter Bright would mind giving me permission to modify it and use it for my personal project, but I prefer working with software under a better license (and I was never a big fan of Walter's coding either, so =P).

Fortunately there is a new alternative now: LDC. You should know by now that LDC is the DMD front-end code glued to the LLVM back-end, that there is an alpha release (with much of the main functionality finished), that it's completely FLOSS, and that it's moving fast and getting better every day (a new release is coming soon too).

I haven't played with LLVM so far, but all I hear about it is that it's a nice compiler framework, easy to learn and work with, widely used, and getting better and better very fast too.

To build LDC just follow the nice instructions (I'm using Debian so I just had to aptitude install cmake cmake-curses-gui llvm-dev libconfig++6-dev mercurial and go directly to the LDC-specific part). Now I just have to learn a little about Mercurial (coming from GIT it shouldn't be too hard), and maybe a little about LLVM, and I'm good to go.

So LDC is my compiler of choice now. And it should be yours too =)

Collected newsgroup links

by Leandro Lucarella on 2009-03-29 01:05 (updated on 2009-03-29 01:05)
tagged d, dgc, en, links, wiki - with 0 comment(s)

I've been monitoring and saving interesting (mostly GC-related) posts from the D newsgroups. I had been saving them all in a plain text file until today, when I decided to add them to a wiki page.

Please feel free to add any missing post that includes interesting GC-related discussions.

Thanks!

D GC Benchmark Suite

by Leandro Lucarella on 2009-03-28 15:31 (updated on 2009-03-28 15:31)
tagged benchmark, d, dgc, en, request - with 0 comment(s)

I'm trying to make a benchmark suite to evaluate different GC implementations.

What I'm looking for is:

Feel free to post trivial tests or links to programs/projects as comments or via e-mail.

Thanks!

Accurate Garbage Collection in an Uncooperative Environment

by Leandro Lucarella on 2009-03-21 17:23 (updated on 2009-03-22 00:05)
tagged accurate, d, dgc, en, henderson, paper, tracing, uncooperative environment - with 0 comment(s)

I just read Accurate Garbage Collection in an Uncooperative Environment paper.

Unfortunately this paper tries to solve problems that D mostly doesn't see as problems, like portability (it targets languages that emit C code instead of native machine code, like the Mercury language mentioned in the paper). Based on the problem of tracing the C stack in a portable way, it suggests injecting code into functions to build a linked list of stack information (which contains local variable information) so the stack can be traced accurately.

I think none of the ideas presented in this paper are suitable for D, because the D GC can already trace the stack (in an unportable way, but it can), and it can get the type info from better places too.

In terms of (time) performance, benchmarks show it's a little worse than the Boehm (et al.) GC, but the authors argue that Boehm has years of fine-grained optimizations and is tightly coupled with the underlying architecture, while their new approach is still almost unoptimized and completely portable.

The only thing it mentions that could apply to D (and any conservative GC in general) is the issues that compiler optimizations can introduce. But I'm not aware of any such issues, so I can't say anything about them.

In case you wonder, I've added this paper to my papers playground wiki page =)

Update

I think I missed the point of this paper. The current D GC can't possibly do accurate tracing of the stack, because there is no way to get type info from there (I was thinking only of the heap, where some degree of accuracy is achieved by setting the noscan bit for bins that don't contain pointers, as mentioned in my previous post).

So this paper could help to get accurate GC into D, but it doesn't seem a great deal when you can add type information about local variables when emitting machine code instead of adding the shadow stack linked list. The only advantage I see is that it should be possible to implement the linked list in the front-end.

Understanding the current GC

by Leandro Lucarella on 2009-01-04 18:37 (updated on 2009-04-09 19:53)
tagged bin, d, dgc, druntime, en, gc, intro, mark-sweep, pool, understanding the current gc - with 1 comment(s)

Oh, yeah! A new year, a new air, and the same thesis =)

After a little break, I'm finally starting to analyze the current D (druntime) GC (basic) implementation in depth.

First I want to say I found the code really, but really, hard to read and follow. Things are split in several parts for no apparent reason, which makes it really hard to understand, and it's pretty much undocumented.

I hope I can fully understand it in due time, to be able to do a full rewrite of it (conserving the main design in a first pass).

Overview

I'll start with a big picture overview, and then I'll try to describe each component with more detail.

The implementation is split in several files:

gcstats.d
I haven't taken a look at this one yet, but I guess it's about stats =).
gcbits.d
A custom bitset implementation for collector bit/flags (mark, scan, etc.).
gcalloc.d
A wrapper for memory allocation, with several versions (malloc, win32, mmap and valloc). 4 functions are provided: map, unmap, commit and decommit. The (de)commit stuff is because (in Sean Kelly's words) Windows has a 2-phase allocation process. You can reserve the address space via map and unmap, but the virtual memory isn't actually created until you call commit. So decommit gets rid of the virtual memory but retains ownership of the address space.
gcx.d
The real GC implementation, split into 2 main classes/structs: GC and Gcx. GC seems to be a thin wrapper over Gcx that only provides the allocation logic (alloc/realloc/free), while Gcx seems to be responsible for the real GC work (and for holding the memory).
gc.d
This is just a thin wrapper over gcx.d to adapt it to the druntime GC interface.

The Gcx struct is where most of the magic happens. It holds the GC memory organized in pools. It also holds the information about roots, the stack and the free list, but in this post I'll focus on the memory pools:

Pool Concept

A pool is a group of pages, where each page has a bin size (Bins) and hosts a fixed number of bins (PAGESIZE / Bins; for example, if Bins == 1024 and PAGESIZE == 4096, the page holds 4 bins).
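The page arithmetic above is trivial but worth pinning down, since it recurs throughout the implementation:

```python
# Bins per page, as described: a page holds PAGESIZE / bin_size bins.
PAGESIZE = 4096

def bins_per_page(bin_size):
    return PAGESIZE // bin_size
```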

Each bin has some bits of information:

mark
Set when the bin is visited by the mark phase.
scan
Set when the bin has been visited by the mark phase (the mark bit is set) but it still has pointers to be scanned.
free
Set when the bin is free (linked to a free list).
final
The object stored in this bin has a destructor that must be called when freed.
noscan
This bin should not be scanned by the collector (it has no pointers).
+----------------------------------------+-----+-----------------+
| Page 0 (bin size: Bins)                | ... | Page (npages-1) |
|                                        |     |                 |
| +--------+-----+---------------------+ |     |                 |
| | Bin 0  | ... | Bin (PAGESIZE/Bins) | |     |                 |
| +--------+-----+---------------------+ |     |                 |
| | mark   | ... |                     | |     |                 |
| | scan   | ... |                     | |     |       ...       |
| | free   | ... |         ...         | |     |                 |
| | final  | ... |                     | |     |                 |
| | noscan | ... |                     | |     |                 |
| +--------+-----+---------------------+ |     |                 |
+----------------------------------------+-----+-----------------+

Pool Implementation

A single chunk of memory is allocated for the whole pool; baseAddr points to the start of the chunk and topAddr to the end. A pagetable holds the bin size (Bins) of each page:

.          ,-- baseAddr                                   topAddr --,
           |                   ncommitted = i                       |
           |                                                        |
           |--- committed pages ---,------ uncommitted pages -------|
           V                       |                                V
           +--------+--------+-----+--------+-----+-----------------+
    memory | Page 0 | Page 1 | ... | Page i | ... | Page (npages-1) |
           +--------+--------+-----+--------+-----+-----------------+
               /\       /\      /\     /\      /\          /\
               ||       ||      ||     ||      ||          ||
           +--------+--------+-----+--------+-----+-----------------+
 pagetable | Bins 0 | Bins 1 | ... | Bins i | ... | Bins (npages-1) |
(bin size) +--------+--------+-----+--------+-----+-----------------+

The bin size can be one of:

B_XXX
The XXX is a power of 2 from 16 to 4096. The special name B_PAGE is used for the size 4096.
B_PAGEPLUS
The whole page is a continuation of a large object (the first page of a large object has size B_PAGE).
B_FREE
The page is completely free.
B_UNCOMMITTED
The page is not committed yet.
B_MAX
Not really a value, used for iteration or allocation. Pages can't have this value.

The information bits are stored in a custom bit set (GCBits). npages * PAGESIZE / 16 bits are allocated (since the smallest bin is 16 bytes long) and each bit is addressed using this formula:

bit(pointer) = (pointer - baseAddr) / 16

This means that one bit is reserved for every 16 bytes. For large bin sizes, a lot of bits are wasted.
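The bit-addressing formula above, expressed in Python (function names invented): one bit per 16-byte granule, so a pool of npages pages needs npages * PAGESIZE / 16 bits per bit set.

```python
# bit(pointer) = (pointer - baseAddr) / 16, plus the bit-set sizing rule.
PAGESIZE = 4096

def bit_index(pointer, base_addr):
    return (pointer - base_addr) // 16

def bits_needed(npages):
    return npages * PAGESIZE // 16
```

For the minimum 256-page (1 MiB) pool this is 65536 bits, i.e. 8 KiB per bit set; but a page of B_PAGE bins uses only 1 of its 256 bits, which is the waste mentioned above.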

The minimum pool size is 256 pages. With 4096 bytes pages, that is 1 MiB.

The GCBits implementation deserves another post, it's a little complex and I still don't understand why.