A simple introduction to Debian package tags

Posted by Steve on Fri 22 Jul 2005 at 10:29

One of the new features currently being introduced into Debian's unstable distribution is a "tag" implementation. This allows small pieces of meta-data to be associated with each package in the archive. This data can be useful for searching, and for finding new packages.

Historically the Debian archive has been managed by splitting it up into sections. There are a small number of sections available and each package belongs to one, and only one, section.

For example, a game would go into the Games section, and a Perl library would go in the Perl section. Each of the sections is listed, with a brief description, on the Debian website.

Whilst the sections allow a simple and efficient way of categorising software, the system suffers from two main flaws:

  • The sections are too coarse-grained.
  • A package can only belong to one section.

As a result, various people have proposed expanding the number of available sections at different times. Another more flexible and open-ended solution has also been proposed several times: adding "tags" to packages to allow them to be described and categorised more accurately.

The tags system is now live in Debian's unstable distribution (codenamed Sid) and should make it into the Etch release.

I first noticed this by accident when viewing the description of a package with apt-cache. If you view, for example, the description of the tidy package you will see the tag information at the bottom:

skx@mystery:~$ apt-cache show tidy
Package: tidy
Priority: optional
Section: web
Installed-Size: 40
Maintainer: Jason Thomas 
Architecture: i386
Version: 20050415-1
Depends: libc6 (>= 2.3.2.ds1-21), libtidy0
Suggests: tidy-doc
Filename: pool/main/t/tidy/tidy_20050415-1_i386.deb
Size: 17020
MD5sum: 983571c271b64f93b01903f56479a70d
Description: HTML syntax checker and reformatter
 Corrects markup in a way compliant with the latest standards, and
 optimal for the popular browsers.  It has a comprehensive knowledge
 of the attributes defined in the HTML 4.0 recommendation from W3C,
 and understands the US ASCII, ISO Latin-1, UTF-8 and the ISO 2022
 family of 7-bit encodings.  In the output:
 .
  * HTML entity names for characters are used when appropriate.
  * Missing attribute quotes are added, and mismatched quotes found.
  * Tags lacking a terminating '>' are spotted.
  * Proprietary elements are recognized and reported as such.
  * The page is reformatted, from a choice of indentation styles.
 .
 Tidy is a product of the World Wide Web Consortium.
Tag: interface::commandline, use::checking, role::sw-utility, format::html, devel

As you can see, the last line of the output lists various tags. These give some details about the package, such as how it is used ("interface::commandline").

This information isn't contained in the Debian package itself; instead it is contained in the Packages file.

When you run "apt-get update" or "aptitude update" you connect to a number of repositories and download files which contain details about all the packages held in each repository, including their size and description. This information can be used to search for a package, and these files now include the tag information too.

The package lists are stored in the directory /var/lib/apt/lists and are simple text files. You can examine them yourself if you wish to see the various "Tag:" entries.
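
Because the package lists are plain text, the "Tag:" entries are easy to pull out with a few lines of code. The following is a minimal Python sketch, not anything that apt or debtags does itself. The parse_tags function and the SAMPLE text are made up for illustration (the sample stanzas are abbreviated from output shown in this article), and it assumes the usual layout of "Field: value" lines, indented continuation lines, and a blank line between packages.

```python
def parse_tags(packages_text):
    """Map each package name to the list of tags in its "Tag:" field."""
    tags_by_package = {}
    # Assumption: stanzas are separated by blank lines.
    for stanza in packages_text.strip().split("\n\n"):
        fields = {}
        last_field = None
        for line in stanza.splitlines():
            if line[:1].isspace() and last_field:
                # Indented lines continue the previous field.
                fields[last_field] += " " + line.strip()
            elif ":" in line:
                name, _, value = line.partition(":")
                last_field = name
                fields[name] = value.strip()
        if "Package" in fields:
            raw_tags = fields.get("Tag", "")
            tags_by_package[fields["Package"]] = [
                tag.strip() for tag in raw_tags.split(",") if tag.strip()
            ]
    return tags_by_package


# Illustrative sample only: abbreviated, hand-written stanzas.
SAMPLE = """\
Package: tidy
Section: web
Description: HTML syntax checker and reformatter
 Corrects markup in a way compliant with the latest standards.
Tag: interface::commandline, use::checking, role::sw-utility, format::html, devel

Package: imapproxy
Tag: interface::daemon, mail::imap, protocol::imap, use::proxying
"""

if __name__ == "__main__":
    for package, tags in parse_tags(SAMPLE).items():
        print(package, "->", ", ".join(tags))
```

To try it against real data you could read one of the files under /var/lib/apt/lists and pass its contents to parse_tags.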

If you wish you can now search for packages using the tags instead of any keywords which might be located inside the package description.

To do that you will need to install two new tools:

  • debtags - Commandline interface to libdebtags functions and Debtags administration tool
  • debtags-edit - GUI application to search and tag packages

Installing both packages can be accomplished via apt-get:

apt-get install debtags debtags-edit

(Or "aptitude install debtags debtags-edit" - if you prefer aptitude.)

Once the debtags package has been installed you can run queries against the tags, such as finding packages related to others.

For example, you might be interested in seeing which packages are related to bash:

skx@mystery:~$ debtags related bash
bash3 - The GNU Bourne Again SHell (Version 3)

You can also search for packages which are related to IMAP mail:

skx@mystery:~$ debtags grep mail::imap
mutt: application, interface::text-mode, made-of::lang-c, mail::imap, mail::pop, protocol::imap, protocol::ipv6, protocol::pop, role::sw-client, uitoolkit::ncurses, works-with::mail
nail: interface::commandline, interface::shell, mail::imap, mail::list, mail::pop, mail::smtp, protocol::imap, protocol::pop, protocol::smtp, role::sw-client, special::completely-tagged, use::transmission, works-with::mail
cyrus21-imapd: interface::daemon, mail::filters, mail::imap, network::service, protocol::imap, protocol::ipv6, role::sw-server, works-with::mail
imapproxy: interface::daemon, mail::imap, protocol::imap, use::proxying
squirrelmail: interface::web, made-of::lang-php, mail::imap, protocol::imap, works-with::mail
getmail4: mail::imap, mail::pop, protocol::imap, protocol::pop, protocol::ssl
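
Conceptually a search like this is just a filter over a mapping of package names to tags. Here is a small Python sketch of that idea, offered as an illustration rather than a description of how debtags is implemented. The packages_with_tag function is a hypothetical name, and the TAGS data is copied by hand from output shown in this article.

```python
def packages_with_tag(tags_by_package, wanted):
    """Return the sorted names of the packages that carry the wanted tag."""
    return sorted(
        package
        for package, tags in tags_by_package.items()
        if wanted in tags
    )


# Hand-copied sample data; a real tag database is far larger.
TAGS = {
    "imapproxy": {"interface::daemon", "mail::imap", "protocol::imap",
                  "use::proxying"},
    "getmail4": {"mail::imap", "mail::pop", "protocol::imap",
                 "protocol::pop", "protocol::ssl"},
    "tidy": {"interface::commandline", "use::checking", "role::sw-utility",
             "format::html", "devel"},
}

if __name__ == "__main__":
    for name in packages_with_tag(TAGS, "mail::imap"):
        print(name)
```

Storing each package's tags as a set keeps the membership test cheap, which matters once the mapping covers the whole archive.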

How did I know that mail::imap was the tag used for describing mail and IMAP? That was the result of a "tagsearch":

skx@mystery:~$ debtags tagsearch mail
mail::TODO - Need an extra tag
mail::filters - Filters
mail::imap - Mail access via IMAP
mail::list - Mailing Lists
mail::notification - Notification
mail::pop - Mail access via POP3
mail::smtp - Mail transfer via SMTP
media::mail - Email
protocol::pop - Mail access via POP3
protocol::smtp - SMTP Simple Mail Transport Protocol
works-with::mail - Email
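
Judging from the output above, the search appears to match the keyword against both the tag name and its short description, ignoring case. That is an inference from this one example, not documented behaviour. The Python sketch below imitates it over the same eleven entries; tag_search and by_facet are hypothetical helper names, and splitting on "::" simply reflects the "facet::tag" shape of the names shown.

```python
# Tag names and descriptions copied from the tagsearch output above.
VOCABULARY = {
    "mail::TODO": "Need an extra tag",
    "mail::filters": "Filters",
    "mail::imap": "Mail access via IMAP",
    "mail::list": "Mailing Lists",
    "mail::notification": "Notification",
    "mail::pop": "Mail access via POP3",
    "mail::smtp": "Mail transfer via SMTP",
    "media::mail": "Email",
    "protocol::pop": "Mail access via POP3",
    "protocol::smtp": "SMTP Simple Mail Transport Protocol",
    "works-with::mail": "Email",
}


def tag_search(vocabulary, keyword):
    """Return tags whose name or description contains keyword (any case)."""
    needle = keyword.lower()
    return sorted(
        tag
        for tag, description in vocabulary.items()
        if needle in tag.lower() or needle in description.lower()
    )


def by_facet(tags):
    """Group tag names by the part before the "::" separator."""
    groups = {}
    for tag in tags:
        facet, _, _ = tag.partition("::")
        groups.setdefault(facet, []).append(tag)
    return groups


if __name__ == "__main__":
    for tag in tag_search(VOCABULARY, "pop"):
        print(tag, "-", VOCABULARY[tag])
```

Grouping the matches with by_facet makes it easier to see that the same subject can turn up under several facets, such as mail, protocol and works-with.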

There are several other options; perhaps the best way to learn more is to read the manpage by running "man debtags".

For much more detailed information please consult the DebTags homepage.

Re: A simple introduction to Debian package tags
Posted by yaarg (129.215.xx.xx) on Fri 22 Jul 2005 at 10:58
Ahaha, I thought that was new. Good idea indeed.

Re: A simple introduction to Debian package tags
Posted by shufla (83.30.xx.xx) on Fri 22 Jul 2005 at 11:02
Hi,

I just can't wait until they implement the deb database in SQL - it's so close ;) But anyway that's a great idea - consolidation and a sign of Debian's maturation.

Is it the influence of the new Debian Project Leader that I see more movement in Debian among Debian developers?

While I was waiting for sarge, Debian development was, how to say, unaimed. Right now there are many movements in many directions, but I feel that they are well "led". Keep up the good work :) Etch will rock ;)

Re: A simple introduction to Debian package tags
Posted by Steve (82.41.xx.xx) on Fri 22 Jul 2005 at 11:06
I don't think that many of the changes have come from the top; Debian really doesn't work like that.

The people implementing all the new ideas are really doing it of their own volition. Things like the tag support have been happening in the background for months before going live.

Whilst it's true that Branden is organising some teams for various purposes, he's not setting technical direction - more making sure the project's infrastructure is OK.

Steve
-- Steve.org.uk

Re: A simple introduction to Debian package tags
Posted by shufla (83.30.xx.xx) on Fri 22 Jul 2005 at 11:17
Well, I wouldn't remove credit from people creating and implementing cool, needed stuff. But the post-woody, pre-sarge days in Debian made me feel that there was no aim. Very good, advanced and cool stuff was made, but wasn't finished (== wasn't implemented in the stable repository, wasn't supported in the Debian way). In Poland there's PLD http://www.pld-linux.org/, which has many highly skilled hackers who are addicted to their system (I was close to them, but not enough to say "participate", rather a little work ;) ) - but they are not led, many problems arise when new stuff is implemented, and without a proper 'committee' which is able to smooth over personal and/or technical differences they haven't been able to release a stable distro for many years (PLD 1.0 rocked in its time, 2.0 still isn't released, and 3.0 is now in active development). It's a pity when so much energy is wasted in such a way.

What I see now is that the Debian community has its destination, and they are self-disciplined, so waste is reduced.

Luke

PS. Ah! LANG=en_rubbish, but I hope I'm understood ;)

Re: A simple introduction to Debian package tags
Posted by Steve (82.41.xx.xx) on Fri 22 Jul 2005 at 11:21
I think a lot of the slowdown is due to the sheer size of the distribution. Although pre-Sarge there were a lot of things that were just too late in the day, that couldn't be introduced without postponing the release even further.

It's very difficult to gain a consensus amongst all the developers on any single topic - so a lot of people give up too soon, or have to practically implement a solution fully before people can see how useful / difficult it is.

In small groups a simple idea can be discussed and agreed upon by all present. But this isn't often something that can work for a big project like Debian.

Still it's great to see new, useful, and interesting developments "make it".

(Your language is just fine, don't worry!)

Steve
-- Steve.org.uk

Re: A simple introduction to Debian package tags
Posted by undefined (192.91.xx.xx) on Fri 22 Jul 2005 at 17:36
i will reaffirm this: the fruit we are seeing now, is not from branden's labor (yet).

stuff like this requires infrastructure changes. that requires a lot of time to design and plan. it also requires proper timing. until just recently with the release of sarge, everybody was holding off on any major developments because a release was perpetually pending "any day now".

now that sarge is released, we are seeing major developments being implemented: debtags, x.org, openoffice.org 2, etc.

Re: A simple introduction to Debian package tags
Posted by AndrewBlack (81.2.xx.xx) on Sat 23 Jul 2005 at 09:10
Is there any way of using tags if you are on Sarge? E.g. can you query on the web to find the names of packages that are interesting, and then install the Sarge versions of them?

Re: A simple introduction to Debian package tags
Posted by Steve (82.41.xx.xx) on Sat 23 Jul 2005 at 17:26
If you're on Sarge you're out of luck.

The sarge package files will not contain the information, and the software tools in Sarge wouldn't know what to do with it even if the information was available.

Querying the web-browsable tag categories is possible though - it is linked to from the DebTags homepage included in the article.

(There's no reason why somebody with access to the Sid packages files couldn't set up another simple online CGI script if that implementation goes away, or to allow different searching/browsing possibilities. The tags are very simple to work with).

Steve
-- Steve.org.uk

Re: A simple introduction to Debian package tags
Posted by Anonymous (81.174.xx.xx) on Sun 24 Jul 2005 at 10:23
With sarge you can install the debtags package and run 'debtags update'. This will download tag data and keep them in /var/lib/debtags, to be accessed by debtags and debtags-edit.

They will however be kept separated, and you won't see them when doing apt-cache show.

--Enrico

Re: A simple introduction to Debian package tags
Posted by Anonymous (80.216.xx.xx) on Sat 30 Jul 2005 at 23:33
I guess you are the same Enrico that's on tape talking about Deb Tags at debconf5 in Helsinki. It was a really enjoyable talk, very informative and funny. I'm not sure debtags is the way to go, since it seems too much like dmoz and too little like google. But your examples of why free-text search (a la google) doesn't work were funny.. ;-) For your viewing pleasure: Video Slides
