
Help: Backporting nscd to sarge or upgrading glibc?

Posted by lollipop on Mon 24 Apr 2006 at 09:23

Our current server setup is composed of 25 or so servers running Debian sarge. I use OpenLDAP for managing authentication and user information. Everything works quite well when the LDAP server is up and running, but whenever it goes down, havoc ensues across all our servers.

I assumed that nscd (Name Service Caching Daemon) would cache the important information allowing our servers to continue to function during a small ldap outage. However, nscd on my Sarge servers was not caching any data.

After some investigation with strace I discovered that the resolving library was looking for the nscd socket at /var/run/nscd/socket. In Sarge, nscd creates the socket file at /var/run/.nscd_socket, and there does not seem to be a way to tell the daemon where to create the socket. This problem is fixed in unstable, but as a workaround for Sarge I just added a symlink to the real nscd socket.
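The symlink workaround described above could be sketched roughly as follows. The two paths are the ones reported in the post; the function name link_nscd_socket and the overridable variables are hypothetical, added only for illustration, and the whole thing assumes it is run as root on a system where nscd really does create its socket at the first path.

```shell
# Paths as reported in the post; override them if your system differs.
REAL_SOCKET="${REAL_SOCKET:-/var/run/.nscd_socket}"        # where Sarge's nscd creates the socket
EXPECTED_SOCKET="${EXPECTED_SOCKET:-/var/run/nscd/socket}" # where the resolving library looks

# Hypothetical helper: point the expected path at the real socket.
link_nscd_socket() {
    # Make sure the directory the library looks in exists.
    mkdir -p "$(dirname "$EXPECTED_SOCKET")" || return 1
    # -s: symbolic link, -f: replace a stale link if one is already there.
    ln -sf "$REAL_SOCKET" "$EXPECTED_SOCKET"
}

# Usage (as root): link_nscd_socket
```

Since /var/run may be cleaned at boot, the link would probably have to be recreated from an init script as well; that part is not covered here.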

So now 'nscd --statistics' was showing that data was indeed being cached and applications were successfully querying nscd. Unfortunately, running 'lsof -i @:ldap' on my web machines still showed connections to our LDAP server from the apache process.

This was due to my nsswitch.conf setup:

passwd:         files ldap
group:          files ldap
shadow:         files ldap

hosts:          files dns
networks:       files

protocols:      files
services:       files
ethers:         files
rpc:            files

netgroup:       nis

By default, group membership is checked against all sources listed in the 'group:' line. So every time apache spawns a process, it queries files and ldap to determine which groups the apache user (www-data) is a member of, and nscd was not caching this query.
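To see which sources a given lookup will consult, the relevant line can be pulled out of nsswitch.conf. The helper below is only a sketch: the name nss_sources is made up for this example, and it assumes the simple "database: source source" layout shown above, with no inline comments.

```shell
# Hypothetical helper: print the sources configured for one NSS database.
#   $1 = database name (e.g. group)
#   $2 = config file (defaults to /etc/nsswitch.conf)
nss_sources() {
    # Match the line whose first field is "<database>:", blank that field,
    # strip the leading space left behind, and print the remaining sources.
    awk -v db="$1:" '$1 == db { $1 = ""; sub(/^ +/, ""); print }' \
        "${2:-/etc/nsswitch.conf}"
}

# Usage: nss_sources group
```

With the configuration quoted in the post, the group database would list both files and ldap, which is why each group enumeration ends up at the LDAP server when nothing caches it.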

I came to find out that the enumeration of groups is not cached in the Sarge nscd version, 2.3.2, which makes nscd useless for caching LDAP data in Sarge.

Group caching was added in 2.3.3 according to this nscd changelog.

I would really like to upgrade nscd to the latest version on my Sarge boxes, so I can obtain this functionality. I haven't been able to find a backport of nscd. Upgrading to the latest in unstable would necessitate me upgrading glibc as well.

I was warned that it is not advisable to upgrade to a different version of glibc than what is in stable. Is this still the case?

I tried compiling the latest version of the nscd package on Sarge, but it appears that you have to compile glibc as well. Is there a way to compile nscd against the version of glibc in Sarge?

Any suggestions other than nscd for the LDAP caching problem?


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by Anonymous (158.109.xx.xx) on Mon 24 Apr 2006 at 12:16
Are you sure nscd was not caching data AT ALL?

That's very strange and in my opinion, it should be marked as a critical bug.


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by lollipop (198.63.xx.xx) on Mon 24 Apr 2006 at 14:30
With the changes I made, nscd is caching some data: it caches hosts and userids, but it does not cache the enumeration of which groups a user is in. For instance, if you type 'id ' this group data is not cached. You can verify this by using 'nscd -d' and watching what data is cached. It happens that apache enumerates its groups every time it spawns a new thread, which then fails when ldap is not accessible.


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by Anonymous (158.109.xx.xx) on Mon 24 Apr 2006 at 16:40
Why don't you report it to the Debian developers, please?

I don't know a lot about nscd. But if it is supposed to cache the data and it is not caching it at all ... that's a BIG problem.


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by Anonymous (62.168.xx.xx) on Mon 24 Apr 2006 at 16:58
Haven't tried, but:

http://packages.debian.org/nss-updatedb

may help...


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by lollipop (198.63.xx.xx) on Mon 24 Apr 2006 at 18:41
thanks for the pointer...

I had looked at this before, but got sidetracked trying to determine exactly what was happening with nscd.

I just gave it a whirl and it seems to work well when the ldap server is inaccessible.

It does not, however, handle the caching problem, which means individual servers can still flood my ldap servers with queries.

I think nss-updatedb would work best in combination with a working nscd package.


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by simonw (84.45.xx.xx) on Mon 24 Apr 2006 at 17:46
Shouldn't you have redundant LDAP servers?

I mean, it is standard practice wherever I've been to have 3 NIS servers, 3 ADS servers, 3 DNS servers, and make all critical network services triply redundant.

Not that this solves the bug, but it should make it a lot less obvious.


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by lollipop (198.63.xx.xx) on Mon 24 Apr 2006 at 18:32
I definitely agree with the general sentiment that all critical services in an environment should be as redundant as possible. However, having a reliable environment requires redundancy on multiple levels.

What happens to your three redundant ldap servers when a firewall change prevents your dmz web servers from querying your internal ldap servers?

How do your redundant ldap servers help when a commit to the master ldap server database breaks group queries and this broken entry is replicated to all your slave ldap servers?

What happens when your apache servers start spawning hundreds of threads because of a broken cgi script, flooding all your ldap servers with queries, resulting in ldap timeouts across your whole lan?

I have experienced all of these problems, and a working nscd program would help to avoid them.


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by paulgear (203.206.xx.xx) on Sun 30 Apr 2006 at 10:33
For what it's worth, i run sarge with an etch libc6 on several of my servers, and it works fine. I did this because i wanted to run the virus scanning version of Dan's Guardian from etch. It requires some packages to be upgraded:

Depends: libc6 (>= 2.3.5-1), libclamav1 (>= 0.86.2), libesmtp5 (>= 0.8.8), libgcc1 (>= 1:4.0.0-9), libstdc++6 (>= 4.0.1), zlib1g (>= 1:1.2.1),

I did this by installing the etch versions, and it works fine. I have two different systems doing this, and i'm still tossing up whether i should continue upgrading as new etch versions are released. My gut feeling is no, but i may change this depending on security issues.

I also pin shorewall and adzapper to testing for all my sarge systems.
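For readers unfamiliar with pinning, a stanza in /etc/apt/preferences is one way this is usually done. The sketch below is not taken from the poster's systems: the helper name write_pin and the priority of 990 are assumptions chosen for illustration, and it presumes a testing line is already present in sources.list.

```shell
# Hypothetical helper: append an APT pin stanza for one package.
#   $1 = package name, $2 = release archive (e.g. testing)
#   $3 = preferences file (defaults to /etc/apt/preferences; root needed there)
write_pin() {
    # PIN_PRIORITY is an assumed, overridable value; 990 is only an example.
    cat >> "${3:-/etc/apt/preferences}" <<EOF
Package: $1
Pin: release a=$2
Pin-Priority: ${PIN_PRIORITY:-990}

EOF
}

# Usage (as root): write_pin shorewall testing
```

Running 'apt-cache policy shorewall' afterwards shows which version APT would now pick, which is a reasonable check before upgrading anything.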


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by paulgear (203.206.xx.xx) on Thu 28 Sep 2006 at 13:20
For what it's worth, i've backed out this change on my sarge systems, and now i just use the Dan's Guardian package from backports.org. Using backports means that the system keeps a lot closer to etch without actually being etch, and i expect it will make the upgrade to etch smoother.


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by baffle (195.0.xx.xx) on Wed 10 May 2006 at 09:50
Did you ever find a good way of solving this?


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by lollipop (198.63.xx.xx) on Wed 17 May 2006 at 19:18

I was unable to solve the caching problem: even the latest version of nscd invalidates group data too early (bug 173019). So I decided it wasn't worth attempting to get the latest nscd sources to compile against sarge's glibc.

Instead I focused on cases where the ldap server was inaccessible. I used nss-updatedb to permanently cache the group and user data. This solution works well, since our ldap database is fairly small. In addition I added these values to my libnss-ldap.conf:

timelimit 5
bind_timelimit 1
bind_policy soft

which helps to prevent queries to ldap from just hanging indefinitely.
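As a rough illustration of the nss-updatedb part of this setup: as far as I understand the tool's documentation, running it against the ldap source dumps the passwd and group maps into local db files, and a db entry in nsswitch.conf then serves as the fallback. The wrapper name refresh_local_nss_copy and the overridable NSS_UPDATEDB variable are hypothetical, and the nsswitch.conf lines in the comments are an assumption rather than the poster's actual configuration.

```shell
# Command name is overridable; nss_updatedb is what the package is assumed
# to install.
NSS_UPDATEDB="${NSS_UPDATEDB:-nss_updatedb}"

# Hypothetical wrapper: refresh the local copy of the LDAP user and group
# data. Returns the tool's exit status so a cron job can report failures.
refresh_local_nss_copy() {
    "$NSS_UPDATEDB" ldap
}

# Assumed nsswitch.conf lines that fall back to the local copy when LDAP
# is unavailable:
#   passwd: files ldap [NOTFOUND=return] db
#   group:  files ldap [NOTFOUND=return] db

# Usage (as root, e.g. from cron): refresh_local_nss_copy
```

How often to refresh depends on how quickly account changes need to reach the servers; with a small directory, as described above, a frequent cron run should be cheap.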


Re: Help: Backporting nscd to sarge or upgrading glibc?
Posted by Anonymous (62.181.xx.xx) on Fri 12 May 2006 at 23:53
Are you sure you have checked your claims properly?

We are using nscd on 20+ Debian sarge machines with user data stored in LDAP and I have just checked it works without any problems (passwd + group data).

Please provide more information, because I would be tempted to say that your claims are not valid.


Posted by lollipop (198.63.xx.xx) on Wed 17 May 2006 at 19:29

I did a fair amount of investigative work and I believe my claims are correct.

With regard to nscd not caching data, please see my comment on Debian bug 345168. Perhaps the nscd sockets are created correctly under some circumstances, but they were not during my install attempts.

As to my claim that nscd is not caching initgroup entries, this is fairly well documented in Debian and Red Hat bugs, as well as the glibc changelog. Even in the latest versions, the ability to cache initgroup data is still broken (bug 173019).
