An end to class renaming

by Colin Adams (modified: 2007 Jul 21)

See http://colina.demon.co.uk/?q=node/31

Comments

Bernd Schoeller (16 years ago 22/7/2007)
Nice idea

I really think it is a nice idea. Though I am not a Java person, I think they definitely did get many things right. One of these things (and I think they really were the first ones and deserve the credit) was to introduce a global naming system for classes and libraries.

The approach of Eiffel of using a "flat structure" with "manual name-clash resolution" is fine for features within classes, as everything here is controlled by a single person or group. But when building libraries or applications from other libraries, it is difficult to keep an overview. Here, it is necessary that classes do have a unique, global identifier that is the same in every context.

My only criticism of your suggestion, Colin, is the use of URLs: I think URLs are too heavy. Why would we need to specify a transmission protocol like 'http' or 'ftp' when all we want to do is prevent name clashes? Just using the reverse DNS hierarchy, as is done in Java, is - in my humble opinion - much nicer and similarly powerful.
- Colin Adams (16 years ago 22/7/2007)
  No transmission protocols
  
  http or ftp is not a protocol within a URL; it is a URI scheme name. URIs are just opaque strings for namespaces.
  
  Also I said URIs, not necessarily URLs. Although I admit that an http URL is the most likely URI scheme to be used. But mailto scheme URIs are also a plausible candidate.
  
  Also URIs have the advantage over Java's reverse DNS hierarchy in that you can structure your URI space for multiple libraries. If you only have one DNS address, you can't do this.
  
  And if you don't have a DNS address at all (many people don't - probably the majority of users), then the mailto scheme is perfect.
  
  Colin Adams
Manu (16 years ago 23/7/2007)

Suppose you have two libraries, a and b, which both have a class called MY_CLASS, and library a uses b a lot. With a URI mechanism, you will end up having to qualify MY_CLASS from b every time you want to use it in a. As Bernd pointed out, this is quite heavy.

You seem to reject renaming, but for those situations renaming is definitely the best thing you can do.

In any case, your URI scheme and the way EiffelStudio does it are exactly the same in principle, although they differ slightly in form.
- Colin Adams (16 years ago 23/7/2007)
  Rejecting renaming
  
  Yes, I am rejecting renaming. It is highly confusing.
  
  If library a is using MY_CLASS from library b a lot, then why does it have a MY_CLASS of its own? It seems a strange decision to make, but if you are going to do that, then you certainly should qualify the use of b's MY_CLASS. To do otherwise is just confusing.
  
  And Bernd wasn't saying that qualifying was heavy - he was saying that using a URI rather than a reverse DNS name was heavy. But I don't agree that that is the case, and I gave what I thought were very good reasons why a URI is more appropriate.
  
  Colin Adams
  - Manu (16 years ago 24/7/2007)
    Short vs. Long
    
    If you end up seeing com.eiffel.www.library.a_library.CLASS all over the place, it is very verbose and heavy. As the author of library a, you know what MY_CLASS stands for. In a way it is not renaming; it is an 'also known as' relation.
    - Colin Adams (16 years ago 24/7/2007)
      Library design
      
      If you are the author of library b (not a), then you know if you are going to use it a lot. If you are, then don't use the same name as a depending library.
      
      And you got the name wrong - that is not a namespace.
      
      Colin Adams
      - Eric Bezault (16 years ago 24/7/2007)
        multiple library dependency
        
        There is something I still don't understand. OK, I understand that if library A uses library B, I had better not name classes in A with the same name as classes in B. But now suppose library Z uses libraries X and Y, and X and Y (which are independent libraries) both have a class with the same name, FOO. Suppose also that in library Z I need to use class FOO from X and class FOO from Y a lot. What is the solution if I don't want my code in library Z to have long class names all over the place? Or, if there is no solution, one may wonder how often this situation really occurs in practice.
        
        Colin Adams (16 years ago 25/7/2007)
        Prefixes?
        
        If you need to use both classes, but only one of them a lot, then you can set the configuration search order to look first in the library where you will use the class the most.
        
        But this has two problems - you might want to use both versions a lot, or you might want to use one class from X a lot, but another from Y a lot.
        
        So another possibility is to use prefixes to represent the namespaces, like XML does (for a different reason).
        
        But I see Martin has just made a long comment, so I need to read that first.
        
        Colin Adams
Martin Seiler (16 years ago 25/7/2007)
name clashes

First, we do not need a globally unique namespace. We already have the UUID for this purpose, which does not bother the user.

Suppose you use two libraries, FRENCH_PEOPLE and SPANISH_PEOPLE, which contain different classes with the same name, say PERSON. Then you should define a renaming convention for yourself.

I suggest you choose a name to reference each library: FRANCE and SPAIN.

Now you rename PERSON of the FRENCH_PEOPLE library into PERSON_OF_FRANCE. Analogously, you rename PERSON of the SPANISH_PEOPLE library into PERSON_OF_SPAIN.

I tend to see very verbose and exact class names in Eiffel. Adding such a suffix is not really a big deal.

Now if we had some kind of DNS-based namespaces, it would be a lot more verbose and would not look very Eiffel-like.

Compare this:

    local
        l_renamed_french_guy: PERSON_OF_FRANCE
        l_nmspced_french_guy: http://frenchpeople.origo.ethz.ch.PERSON
    do

Ok, but maybe we really want namespaces? Well, why not... after all there is always room for change if a lot of people want a certain feature, right?

So my suggestion would be the following: When you _reference_ a library in your config file you specify a locally unique, nice, neat, and small name for that library. So now I suggest to do something like this:

    local
        l_french_person: PERSON of france
        l_spanish_person: PERSON of spain
    do

This looks much better in my opinion. If you like my suggestion and we don't find serious issues this would be very easy to add to ECF. Until then, you could just adopt a convention and use renaming in conflicting cases:

    local
        l_french_person: PERSON_of_FRANCE
        l_spanish_person: PERSON_of_SPAIN
    do

(Please do not complain if I wrote something in capital letters or not where you don't like it.)
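
The "CLASS of library" lookup suggested above can be illustrated with a small Python sketch of what a compiler might do when it meets such a type name. Everything in it is hypothetical and is not part of ECF or any Eiffel compiler. The `LIBRARY_ALIASES` table stands in for the short names given where a library is referenced, `LIBRARY_CLASSES` holds made-up class lists, and `resolve` is an illustrative helper.

```python
# Hypothetical model of resolving "PERSON of france" to one concrete class.

# Short local alias -> library, as it might be declared in a config file.
LIBRARY_ALIASES = {
    "france": "FRENCH_PEOPLE",
    "spain": "SPANISH_PEOPLE",
}

# Library -> names of the classes it provides (illustrative data only).
LIBRARY_CLASSES = {
    "FRENCH_PEOPLE": {"PERSON", "CITY"},
    "SPANISH_PEOPLE": {"PERSON"},
}


def resolve(type_name):
    """Resolve 'CLASS' or 'CLASS of alias' to a (library, class) pair."""
    class_name, separator, alias = type_name.partition(" of ")
    class_name, alias = class_name.strip(), alias.strip()
    if separator:
        # Qualified form: the alias selects the library directly.
        library = LIBRARY_ALIASES.get(alias)
        if library is None:
            raise LookupError(f"unknown library alias: {alias!r}")
        if class_name not in LIBRARY_CLASSES[library]:
            raise LookupError(f"{library} has no class {class_name}")
        return library, class_name
    # Unqualified form: accepted only if exactly one library has the class.
    owners = sorted(library for library, classes in LIBRARY_CLASSES.items()
                    if class_name in classes)
    if len(owners) != 1:
        raise LookupError(f"{class_name} is ambiguous or unknown: {owners}")
    return owners[0], class_name
```

In this sketch an unqualified name such as `CITY` still resolves because only one library provides it, while a bare `PERSON` is rejected as ambiguous, so the qualifier is only needed where a clash actually exists.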

I think, given the fact that everybody here can name all important open source libraries for Eiffel by heart, and under the assumption that we do not produce stupid name clashes, we can say that just by using renaming and a good convention you have a working solution. And there is room to add syntax to support this in the language directly. Then one could possibly also choose namespaces on a per-class level by using something like:

    note: use FRANCE

In case of name clashes, preference is then given to classes of FRANCE.

-- mTn-_-|
- Colin Adams (16 years ago 25/7/2007)
  I like the "of name" syntax
  
  Well, UUIDs bother me a lot. I think namespaces are much better both in terms of readability and accountability.
  
  But I like your "of some-name" syntax a lot. Much better than the suggestion I just made of using XML-style prefixes.
  
  However I have some problems with the way you suggest to bind the "some-name"s. When I write a configuration file for my library, I want to do it once and for all. And I can't do that if I have to code path-names in.
  
  So again I come back to a namespace. This gives you a once-and-for-all-redistributable binding in a library configuration file. Then in the system configuration file you just have to give hints as to where to find the library for a given namespace (I say hints, as perhaps there could be more flexible ways than just coding a fixed path).
  
  Colin Adams
  - Martin Seiler (16 years ago 27/7/2007)
    Two different issues
    
    Colin about referencing libraries:
    > And I can't do that if I have to code path-names in.
    
    True. It is a different problem though. Currently this is solved by using environment variables in your path. But we might very well think about it too:
    
    The following links talk about potential ways to extend ECF:
    
    http://dev.eiffel.com/Ace_To_Ecf:_Improving_On_The_Existing#Future_of_ECF
    
    http://dev.eiffel.com/ProposalLibraryDependencies
    
    http://dev.eiffel.com/ProposalConfigurationDiscovery
    
    The dependencies solution uses reversed domain names to reference a library in a unique way. The configuration discovery uses a URL to specify the library location.
    
    But what if URLs change? What if your library is first hosted on sourceforge? Then on origo?
    
    Maybe we are better off with
    
    simple names used to reference a library: base, gobo, eiffelmedia (they can also be more complex if needed: ethz.eiffelmedia, ise_base, gobo-structures)
    
    You can alias your libraries on a local basis (just used in the source code: "em" instead of "eiffelmedia")
    
    The compiler knows a set of locations to look for library config files.
    
    Master servers are there to map names of missing library versions to locations (homepage, mirrors, ...)
    
    Our compilers come with a list of "master servers" (no single point of failure) which map the library names to locations by using XML-RPC or whatever. This could possibly be done by an origo instance (http://origo.ethz.ch), for example, ensuring secure and authorized access at the same time.
    
    But in the end, I have never experienced a class name conflict so far. I can handle environment variables, but beginners (especially on Windows) usually have problems getting everything configured properly. So I would give this issue higher priority than class name conflict resolution.
    
    -- mTn-_-|
    - Colin Adams (16 years ago 27/7/2007)
      Not just ECF
      
      Both my original suggestion and your suggestion of using "CLASS of library" require an emendation to the syntax of Eiffel, so it is not just ECF that we are talking about. So it means going to ECMA to agree on something. Unless you are proposing to make ECF a part of the ECMA standard (and I can think of one person at least who would resist that strongly), the solution cannot rely on ECF mechanisms.
      
      As for discovery of libraries, again a URI could be useful, and is a natural answer to the problem.
      
      Now the point you make that where a project is hosted can change is sound, but it doesn't militate against a fixed URI. Apart from such devices as PURLs (Persistent URLs), there is a standard mechanism for URI resolution, and it is called the OASIS ERTC XML Catalog (ERTC stands for Entity Resolution Technical Committee). The input to the resolution process is either a URI or a public identifier (such as a Formal Public Identifier). The output is a URI.
      
      So you have a URI that formally identifies your library. Now a URI can, and probably will, have meaning to the human reader. This is a huge advantage over a UUID. (Do they have any advantages? I can't think of any.) You then look it up in the catalog file (and we have code already in Gobo to do this), and get a URI out that points to the actual library (so I guess it will be a URL, but it might not be one that involves network access - it could be a file URI for instance, for caching a local copy). Note that this solves the versioning problem (you just change the target URI in the catalog for a new version).
      
      You are very lucky that you haven't encountered name clashes yet.
      
      P.S. For URI in all places, read URI or IRI, as a future extension.
      
      Colin Adams
Denis Egoroff (16 years ago 2/10/2007)
Classification issue

Initially, namespaces in Java or C#, for instance, behave like a simple filesystem. However, the matter is not only about how to put class files in different directories of some abstract filesystem (whether distributed or local) and avoid name clashes, but also about providing some classification mechanism (because I think the "filesystem or namespace model" is not enough). E.g. class MATH_VECTOR[G -> ABSTRACT_NUMBER] can be put in both the "mylib.container" and "mylib.number" directories (because it implements +, -, scalar product etc. and because it can store some abstract numbers). So I think the language environment should (or may) offer the possibility of "multiple classification namespaces". Perhaps you can find arguments against this idea, but if you do, you should also dislike multiple inheritance, which is basically the same thing (but at the level of classes).

Something like that:

```
prefix
 mylib.container, -- if dot-notation is confusing with class-feature notation
 mylib.number     -- it's possible to use some other separator / or # :-)
class MATH_VECTOR ....
```

After that definition, class MATH_VECTOR can be referred to both as mylib.container.MATH_VECTOR and as mylib.number.MATH_VECTOR.

The full name of a class (or one of its full names) should be used only when the short name would cause a clash (unlike Java, where you always have to import classes).

Something like this:

```
use
 libOfClashes.subClash.CLASH as PETIT_CLASH
class CLASH inherit PETIT_CLASH ...
```

Naturally, the following form is also possible (if the exported name is used only once): class CLASH inherit libOfClashes.subClash.CLASH

How to map this abstract "multiple classification" filesystem to the real one is a question of implementation (I was concerned with the conceptual issues, which are perhaps not very essential).
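
As a rough illustration of the "multiple classification" lookup described above, here is a minimal Python sketch in which one class can be filed under several namespaces and a short name is accepted only while it is unambiguous. The `Registry` class, its method names and the class identifiers used with it are all assumptions made for illustration; nothing here corresponds to an existing Eiffel tool.

```python
# Hypothetical registry: one class may be filed under several namespaces.
from collections import defaultdict


class Registry:
    def __init__(self):
        self._by_short_name = defaultdict(set)  # short name -> class ids
        self._by_full_name = {}                 # "namespace.NAME" -> class id

    def register(self, class_id, short_name, namespaces):
        """File one class under every namespace it is classified in."""
        self._by_short_name[short_name].add(class_id)
        for namespace in namespaces:
            self._by_full_name[f"{namespace}.{short_name}"] = class_id

    def lookup(self, name):
        """Accept a full name always, a short name only when unambiguous."""
        if name in self._by_full_name:
            return self._by_full_name[name]
        candidates = self._by_short_name.get(name, set())
        if len(candidates) == 1:
            return next(iter(candidates))
        raise LookupError(
            f"{name!r} is unknown or clashes: {sorted(candidates)}")
```

With MATH_VECTOR registered under both "mylib.container" and "mylib.number", either full name and the bare short name all reach the same class. Once two different classes named CLASH are registered, only their full names resolve, which matches the rule that a full name is needed only where a clash occurs.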