Older blog entries for mdupont (starting at number 20)

GNU FUD : http://gcc.gnu.org/ml/gcc/2003-08/msg01136.html

"""" The tree and rtl dumps are intended as debugging aids only. There is no need for them to be complete, they only need to include info we need for debugging. Since we never try to process them, we would never notice if they were incomplete.

As for completing them, that is a potential problem. An FSF policy, intended to prevent people from subverting the GPL, prevents us from emitting debug/intermediate files that could be used by others to use proprietary code with gcc without linking to gcc. This is an inconvenience, but it is current FSF policy so we must respect it. """"

What a primitive and unproductive policy.

Whatever happened to xmlterm?



Why can't we replace printf with something that emits RDF, or some other semantically marked-up stream of data, that identifies the variable printed, the format used, the datatype and the context of that emission?

Then it would be really easy to make really nice xmlterms and other funky things.
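To make the idea concrete, here is a minimal sketch of such a printf replacement. It is only an illustration: the function name `sprint`, the record layout and the use of JSON as a stand-in for RDF are all assumptions of mine, and the values are assumed to be JSON-serializable.

```python
import json
import sys


def sprint(fmt, context, **fields):
    """Print like printf, but describe what was printed as well.

    Hypothetical helper: emits one JSON record per call (JSON stands in
    for RDF here) and returns the record so callers can inspect it.
    """
    record = {
        "context": context,            # where the emission happened
        "format": fmt,                 # the format string that was used
        "text": fmt.format(**fields),  # what a plain terminal would show
        "fields": [
            {"name": name, "datatype": type(value).__name__, "value": value}
            for name, value in fields.items()
        ],
    }
    json.dump(record, sys.stdout)
    sys.stdout.write("\n")
    return record


if __name__ == "__main__":
    sprint("{count} files in {directory}", "ls.main", count=3, directory="/tmp")
```

A terminal that understands these records could show only the `text` part by default and still let you click on a number to see which variable it came from.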

20 Aug 2003 (updated 20 Aug 2003 at 14:48 UTC) »

I have been thinking about the idea of the introspector and the relevance of RDF.

If we were to see the entire compiler toolchain as a set of RDF consumers and producers, what would be the result?

Each program would read in RDF and emit RDF. Each algorithm would add new predicates to the soup.

For example: the graph layout tool would add x and y positions.

But what about context? How can we represent the partitioning of RDF data?

What about the idea of multiple views on the same base data? A view would be a context. When an object is viewed, it occurs in that context with a "viewed" predicate.
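A rough sketch of what I mean, in Python rather than real RDF. Everything here is made up for illustration: the `Quad` tuple, the stage functions, the predicate names and the `view:callgraph` context. The point is only that each tool appends statements, and that a fourth "context" field is one way to partition them.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A statement plus the context (view) in which it holds."""
    subject: str
    predicate: str
    obj: object
    context: str


def parser_stage():
    """Hypothetical front end: emits base facts about two functions."""
    return [
        Quad("fn:main", "type", "function", "base"),
        Quad("fn:helper", "type", "function", "base"),
        Quad("fn:main", "calls", "fn:helper", "base"),
    ]


def layout_stage(quads, view="view:callgraph"):
    """Hypothetical layout tool: adds x/y predicates in its own context."""
    nodes = sorted({q.subject for q in quads if q.predicate == "type"})
    added = []
    for i, node in enumerate(nodes):
        added.append(Quad(node, "viewed", True, view))
        added.append(Quad(node, "x", 100 * i, view))
        added.append(Quad(node, "y", 0, view))
    # The base data is passed through untouched; the tool only adds to it.
    return quads + added


def in_context(quads, context):
    """Select the partition of the soup that belongs to one context."""
    return [q for q in quads if q.context == context]


if __name__ == "__main__":
    soup = layout_stage(parser_stage())
    for quad in in_context(soup, "view:callgraph"):
        print(quad)
```

A second layout tool could add its own positions under another context without disturbing the first view or the base facts.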

more to come

20 Aug 2003 (updated 20 Aug 2003 at 11:26 UTC) »

Sent to the GCC List

--- Joe Buck <jbuck@synopsys.com> wrote:
> On Tue, Aug 19, 2003 at 12:09:13AM -0700, James Michael DuPont wrote:
> > Dear all,
> > Ashley winters and I have produced what is the first prototype for a
> > gcc ontology. It describes the facts extracted from the gcc using the
> > introspector.
>
> Merriam-Webster defines ontology as follows:
>
> 1 : a branch of metaphysics concerned with the nature and relations of being
> 2 : a particular theory about the nature of being or the kinds of existents
>
> I don't think that this is the right term for a piece of code that
> produces XML output from a high-level language input.

You're right. The ontology here is a very high-level description of the gcc tree nodes that allows you to *understand* the RDF/XML output.

The code that produces the data matching this ontology is in my CVS; it is very boring stuff.

I think that this ontology is interesting because it allows you to express the high-level semantics of the tree structures in a somewhat implementation-independent manner.

When this is all done and 100% tested, we should be able to generate sets of C functions to process data from that ontology, databases to store it, and other things like Perl and Java classes to process the data as well.

n3 coupled with CWM and Euler makes a logic programming language like Prolog: you can express not only data schemas but also proofs, filters and algorithms in n3/RDF format.
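To give a feel for what "rules over the data" means, here is a toy forward-chaining engine in Python. It is not CWM or Euler and does not use n3 syntax; the `?var` convention, the function names and the "calls"/"reaches" facts are assumptions made up for this sketch.

```python
def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return new bindings or None."""
    bindings = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable: bind it, or check it against an earlier binding.
            if bindings.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return bindings


def solve(patterns, facts, bindings=None):
    """Yield every variable binding that satisfies all patterns."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)


def forward_chain(facts, rules):
    """Apply (premises, conclusion) rules until no new fact appears."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Collect all matches first so the set is not changed mid-scan.
            for b in list(solve(premises, facts)):
                new = tuple(b.get(term, term) for term in conclusion)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts


FACTS = [("main", "calls", "helper"), ("helper", "calls", "log")]
RULES = [
    ([("?a", "calls", "?b")], ("?a", "reaches", "?b")),
    ([("?a", "reaches", "?b"), ("?b", "calls", "?c")], ("?a", "reaches", "?c")),
]

if __name__ == "__main__":
    for fact in sorted(forward_chain(FACTS, RULES)):
        print(fact)
```

The two rules together derive that `main` reaches `log`, a fact that is in neither the input data nor any single rule.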

I hope that, in the very long term, the proofs expressed in CWM and Euler can be translated automatically into new C functions of the compiler.

In any case this ontology is meant to be human-readable and editable, even if not very pretty. Later on, a lower-level ontology will contain mappings to the exact gcc structures and functions that implement these ASTs.

In any case this ontology should be of interest and value to anyone wanting to study the gcc ASTs, not just someone who wants to deal with an external representation.

The proofs expressed in n3 should be executable directly on the gcc data structures in memory, without any external representation, once we are able to map out all the data structures and generate binding code.

Then users will be able to write custom filters, algorithms and rules that run inside gcc on their own programs.

19 Aug 2003 (updated 19 Aug 2003 at 14:01 UTC) »

Regarding this file : http://introspector.sourceforge.net/2003/08/16/introspector.n3

Basically this is a high-level class model for the GCC internal tree structures as used by the C compiler and (not yet completely) the C++ compiler.

The file is based on the OWL[1] vocabulary, an RDF[2] application that can be written in RDF/XML[3], n3[4] or ntriples[5] syntax.

""""The Web Ontology Language OWL is a semantic markup language for publishing and sharing ontologies on the World Wide Web. OWL is developed as a vocabulary extension of RDF (the Resource Description Framework) and is derived from the DAML+OIL Web Ontology Language. """"

This file describes the data extracted from gcc by the introspector [0]. The format of the data is closely related to the -fdump-translation-unit format, but more usable. I patched gcc using the Redland RDF Application Framework [8] to serialize these tree dump statements into RDF statements, using the Berkeley DB backend for fast storage.

The DB is then available for querying from C/C++, Java, Perl, Python and many other languages via the Redland SWIG interface. What is more, you can filter out interesting statements into RDF/XML format for interchange with other tools.
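As an illustration of the "filter out interesting statements" step, here is a small self-contained Python sketch. It deliberately does not use the Redland bindings; statements are plain tuples, the `http://example.org/...` URIs are placeholders rather than the real introspector vocabulary, and the output is ntriples rather than RDF/XML to keep it short.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """Marks an object as a plain string literal rather than a URI."""
    text: str


def term(t):
    """Render one term in ntriples syntax."""
    if isinstance(t, Literal):
        escaped = (
            t.text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        return f'"{escaped}"'
    return f"<{t}>"


def select(statements, predicate):
    """Keep only the statements that use the given predicate."""
    return [st for st in statements if st[1] == predicate]


def to_ntriples(statements):
    """Serialize statements, one per line, for exchange with other tools."""
    return "".join(f"{term(s)} {term(p)} {term(o)} .\n" for s, p, o in statements)


# Placeholder data: two statements about one hypothetical tree node.
NAME = "http://example.org/gcc#name"
TYPE = "http://example.org/gcc#type"
STATEMENTS = [
    ("http://example.org/tu#n1", NAME, Literal("main")),
    ("http://example.org/tu#n1", TYPE, "http://example.org/gcc#function_decl"),
]

if __name__ == "__main__":
    print(to_ntriples(select(STATEMENTS, NAME)), end="")
```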

You can find an example file, extracted from the source code of the internals of the pnet runtime engine, here [9].

The ontology file is basically a powerful class model. You can use many tools to edit and view it, most of which I have not tried. Two of them are the rdfviz tool and the OWL validator [10].

I used the Closed World Machine [6] from Tim Berners-Lee to process and check this file. That tool, along with EulerSharp [7], which I am working on, will allow you to run queries, filters and proofs over the data extracted from gcc.

Further still, my intent is to embed a small version of the Euler machine into gcc and dotgnu/pnet to allow proofs to be made at compile time.


[0] Introspector - introspector.sf.net

[1] OWL - http://www.w3.org/TR/owl-ref/

[2] RDF - http://www.w3.org/RDF/

[3] RDF/XML - http://www.w3.org/TR/rdf-syntax-grammar/

[4] n3 - http://www.w3.org/2000/10/swap/Primer

[5] ntriples - http://www.w3.org/2001/sw/RDFCore/ntriples/

[6] CWM from timbl - http://www.w3.org/2000/10/swap/doc/cwm.html

[7] EulerSharp - http://eulersharp.sourceforge.net/2003/03swap/

[8] Redland - http://www.redland.opensource.ac.uk/

[9] Example n3 file - http://demo.dotgnu.org/~mdupont/introspector/cwm.rdf.gz

[10] RDFVIZ and validator - http://www.ilrt.bristol.ac.uk/discovery/rdf-dev/rudolf/rdfviz/


I have posted the first version of the introspector ontology here: http://introspector.sourceforge.net/2003/08/16/introspector.n3

Please review.

Here is a nice article on metadata


Today is a sad day: I was banned from the dotgnu list and project. I feel that this was done unfairly.

1. I have contributed more bug reports than any other person to the pnet project.

2. RhysW said that I have made no contribution to the project, despite my good reports.

3. I have often expressed criticism of the various problems that I see, and have also presented wild ideas that may have bothered many people.

I think that I am being discriminated against unfairly by the admins of the dotgnu project, and by RhysW specifically.

That is why I am going to make a formal complaint of discrimination to the GNU project.

more to come


The argument of the DRM proponents is that it is not possible to protect their content without taking away the rights of the students. That is why I have sought to design a solution for content distribution based on free software and open standards that still protects the content from illegal distribution.

I seek with this proposal to address these issues in the context of free software without violating the rights of the students.

Let's say that we have some content that an author worked hard on, and that it should be distributed to people who decide to pay a reasonable fee.

Now the one issue is that even if the users have the right to examine the source code of the software, we still need a way to prevent them from extracting the content out of that software.

If you allow the user to modify the viewing software so that it creates a human-readable and machine-processable copy of the content instead of displaying it, then you are opening up the content for further duplication. We are excluding screenshots and OCR software from consideration here. Let's say that you want to deliver a rasterized copy of the content to the user at an agreed-upon resolution. Vector graphics would again give the user too much control over export.

So we have an agreement between a content provider and a content consumer for the delivery of a certain amount of content, meeting a certain level of quality, to a viewer that limits the user's rights in a predefined manner.

Now, the viewer cannot store the content in an internal data format that is readable by a debugger, because it would be too easy to snarf that data out.

So, I think we can solve this problem very simply: you need to trust that the user will only use an agreed-upon version of the viewer software. This software can be free software, and the full source code may be made available, but the content provider does not agree to provide the content to anything but a specified and verified set of modules on the user's side.

So I propose the following architecture:

1. The users are to be validated by a chip-card system: each user must have a way to authenticate their identity using a card issued by the content provider or a certificate authority. Simple PGP or SSH certificates can also be agreed on here.

2. The users agree to have a free software client module of a specified version installed. This software is able to make a network connection to the content provider and send a digitally signed and encrypted signature of itself to the content provider over a secure channel. This creates a secure session that can only be understood by the client module. The user agrees that he does not have the right to intercept this content, even though it is handled by open and free software that he can inspect at his leisure. The session, however, is only good for one set of packages, because the user might swap out the software once the session is set up. Hardware-based checksumming might help speed up this signature process. BSD has such a software signature built in as well. The user agrees to allow the server to re-check or audit the validity of the client software at its leisure, at a predefined interval; that way the server administrator and the users can agree on a set of security levels appropriate to the performance requirements of the given application.

3. The user uses this session to request content, which is sent securely to him/her. The content is encrypted with an agreed-upon encryption standard that prevents the user from viewing the content directly. Only the client software session, given an authentication token from the provider and one from the client, will be able to decode the content, and only once. The software then deletes that content according to the agreed procedure.

4. The user can then view the rasterized image. That image could also be watermarked and ID-ed. The agreement between the content provider and the user may define various rules preventing the removal of the various security watermarks. Of course the user can take that one raster and distribute it illegally. There is nothing that any of the DRM systems can do to prevent that.

You see, this is a consent-based security system that requires that no freedoms be removed from the user. The content provider reserves the right to refuse delivery of content to any other version of the software; the client, however, has the freedom to modify this software and submit it to content providers for certification.
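The check in step 2 could be sketched roughly as follows. This is only a toy under stated assumptions: the `Provider` class and function names are invented, an HMAC over a key shared with the user stands in for the chip-card signature, and the client is trusted to hash the module it is really running, which is exactly the consent the scheme relies on.

```python
import hashlib
import hmac
import secrets


def module_digest(module_bytes):
    """SHA-256 digest identifying one exact version of the client module."""
    return hashlib.sha256(module_bytes).hexdigest()


class Provider:
    """Hypothetical content provider keeping a list of certified clients."""

    def __init__(self):
        self.certified = set()

    def certify(self, module_bytes):
        # Anyone may submit a modified client; only certified ones get content.
        self.certified.add(module_digest(module_bytes))

    def new_challenge(self):
        # A fresh nonce per session or re-audit prevents replaying old answers.
        return secrets.token_bytes(16)

    def accepts(self, user_key, nonce, digest, tag):
        expected = hmac.new(
            user_key, nonce + digest.encode(), hashlib.sha256
        ).hexdigest()
        return hmac.compare_digest(tag, expected) and digest in self.certified


def client_response(user_key, nonce, module_bytes):
    """Client side: report its own digest, bound to the nonce and the user."""
    digest = module_digest(module_bytes)
    tag = hmac.new(user_key, nonce + digest.encode(), hashlib.sha256).hexdigest()
    return digest, tag


if __name__ == "__main__":
    provider = Provider()
    provider.certify(b"viewer version 1")
    key = secrets.token_bytes(32)  # stand-in for the chip-card credential
    nonce = provider.new_challenge()
    print(provider.accepts(key, nonce, *client_response(key, nonce, b"viewer version 1")))
```

The periodic re-audit is then just the provider issuing a new challenge during the session and dropping the session if the answer no longer matches a certified version.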

I think such consent-based content management is much saner than using non-free file formats and non-free software.

What do you think?

13 Aug 2003 (updated 13 Aug 2003 at 16:19 UTC) »

Here is a small HTML form for hacking your relationships to projects.

<html>
<head>
<title>RelationshipEditor</title>
</head>
<body>
<blockquote><br>
<form method="post" action="http://www.advogato.org/proj/relsub.html">
<input name="name" value="CVS">
<p> Type of relationship: <br>
<select name="type">
<option>None </option>
<option selected="selected">Helper </option>
<option>Documenter </option>
<option>Contributor </option>
<option>Developer </option>
<option>Lead Developer </option>
<option>Victim </option>
<option>User </option>
<option>Interested Party </option>
<option>Advocate </option>
<option>Groupie </option>
<option>Troll </option>
<option>Competitor </option>
</select>
<input type="submit" value="Update">
</form>
</blockquote>
</body>
</html>
