Changelog for ctdb-tests-2.1-5.2.x86_64.rpm :
Thu Mar 1 13:00:00 2012 : Version 1.13
- This is the new stable branch for modern features for ctdb.
Main new features are performance/scaling improvements for
concurrent fetch and fetch_lock operations.

Tue Nov 8 13:00:00 2011 : Version 1.12
- Add new tunable : AllowClientDBAttach that can be used to stop
client db access during maintenance operations
- Updated logging for interfaces that are missing or don't exist but are
configured to be used.
- Add timeout argument to ctdb_cmdline_client
- PDMA support
- Initial support for 'readonly' delegations for ctdb databases.
When finished, this will greatly improve performance for contended hot
records that are used only for read access.
- New 'ctdb cattdb' command
- Massive updates to tests and eventscripts
- LCP2 ip allocation algorithm
- Record Fetch collapse. Collapse multiple fetch-lock requests from clients
to a single network fetch and defer other concurrent requests until the
initial fetch completes, and then service the deferred calls locally.
This will greatly improve performance for contended hot records
where clients request write-locks.
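
The record fetch collapse described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not ctdb's C implementation:
the class name FetchCollapser, the fetch_remote callable and the collapsed
counter are hypothetical names introduced here. The first caller for a key
performs the single network fetch; callers arriving while it is in flight are
deferred and then served from the same result.

```python
import threading
from concurrent.futures import Future


class FetchCollapser:
    """Collapse concurrent fetches of the same key into one remote fetch."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # hypothetical network fetch callable
        self._lock = threading.Lock()
        self._inflight = {}                # key -> Future shared by all waiters
        self.collapsed = 0                 # number of requests that were deferred

    def fetch(self, key):
        with self._lock:
            future = self._inflight.get(key)
            is_leader = future is None
            if is_leader:
                # First request for this key: it will do the remote fetch.
                future = Future()
                self._inflight[key] = future
            else:
                # A fetch is already in flight: defer to its result.
                self.collapsed += 1
        if is_leader:
            try:
                future.set_result(self._fetch_remote(key))
            except Exception as exc:
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._inflight[key]
        # The leader and all deferred callers read the same outcome.
        return future.result()
```

Once the in-flight fetch has completed, the next request for the same key
starts a new remote fetch, so the sketch never serves stale cached data.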

Thu Oct 21 14:00:00 2010 : Version 1.10
- New version 1.10

Tue May 25 14:00:00 2010 : Version 1.9
- Lots of changes

Thu Mar 25 13:00:00 2010 : Version 1.0.114
- Lots of changes from Metze

Wed Jan 13 13:00:00 2010 : Version 1.0.113
- Incorrect use of dup2() could cause ctdb to spin eating 100% cpu.

Tue Jan 12 13:00:00 2010 : Version 1.0.112
- Revert the use of wbinfo --ping-dc as it is proving too unreliable.
- Minor testsuite changes.

Fri Dec 18 13:00:00 2009 : Version 1.0.111
- Fix a logging bug when an eventscript is aborted that could cause a crash.
- Add back cb_status that was lost in a previous commit.

Fri Dec 18 13:00:00 2009 : Version 1.0.110
- Metze: fix for a file descriptor leak in the new eventscript code.
- Rusty: fix for a crash bug in the eventscript code.

Thu Dec 17 13:00:00 2009 : Version 1.0.109
- Massive eventscript updates. (bz58828)
- Nice the daemon instead of using realtime scheduler, also use mlockall() to
reduce the risk of blocking due to paging.
- Workarounds for valgrind when forking once for each script. Valgrind consumes
massive cpu when terminating the scripts on virtual systems.
- Sync the tdb library with upstream, and use the new TDB_DISALLOW_NESTING flag.
- Add new command "ctdb dumpdbbackup"
- Start using the new tdb check framework to validate tdb files upon startup.
- A new framework where we can control health for individual tdb databases.
- Fix a crash bug in the logging code.
- New transaction code for persistent databases.
- Various other smaller fixes.

Mon Dec 7 13:00:00 2009 : Version 1.0.108
- Transaction updates from Michael Adam.
- Use the new wbinfo --ping-dc instead of -p in the eventscript for samba
to check if winbindd is ok.
- Add a better "process-exist" for samba so it will automatically
reap smbd's on stopped and banned nodes to reclaim subrecords.
This will be done a bit differently in the next release.
- Use a statically allocated buffer for the 'first-time' capture buffer
to reduce the pressure on malloc/free.

Wed Dec 2 13:00:00 2009 : Version 1.0.107
- fix for rusty to solve a double-free that can happen when there are
multiple packets queued and the connection is destroyed before
all packets are processed.

Tue Dec 1 13:00:00 2009 : Version 1.0.106
- Buildscript changes from Michael Adam
- Don't do a full recovery when there is a mismatch detected for ip addresses,
just do a less disruptive ip-reallocation
- When starting ctdbd, wait until all initial recoveries have finished
before we issue the "startup" event.
So don't start services or monitoring until the cluster has
stabilized.
- Major eventscript overhaul by Ronnie, Rusty and Martins and fixes of a few
bugs found.

Thu Nov 19 13:00:00 2009 : Version 1.0.105
- Fix a bug where we could SEGV if multiple concurrent "ctdb eventscript ..."
are used and some of them block.
- Monitor the daemon from the syslog child process so we shutdown cleanly when
the main daemon terminates.
- Add a 500k line ringbuffer in memory where all log messages are stored.
- Add a "ctdb getlog" command to pull log messages from the in memory
ringbuffer.
- From martin : fixes to cifs and nfs autotests
- from michael a : fix a bashism in 11.natgw
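
The in-memory log ringbuffer from this release can be illustrated with a
bounded deque. This is a minimal sketch, not ctdb's code: the LogRing class
and its method names are hypothetical, and only the 500k line capacity comes
from the entry above. When the buffer is full, the oldest line is dropped to
make room for the newest.

```python
import time
from collections import deque


class LogRing:
    """Keep only the most recent log lines in memory."""

    def __init__(self, capacity=500_000):
        # A deque with maxlen silently discards the oldest entry when full.
        self._entries = deque(maxlen=capacity)

    def log(self, level, message):
        self._entries.append((time.time(), level, message))

    def getlog(self, max_level=None):
        """Return stored entries, optionally only those at or below max_level."""
        return [entry for entry in self._entries
                if max_level is None or entry[1] <= max_level]
```

A small capacity makes the wrap-around easy to see: with LogRing(capacity=3),
logging five lines leaves only the last three available to getlog().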

Fri Nov 6 13:00:00 2009 : Version 1.0.104
- Suggestion from Metze, we can now use killtcp to kill local connections
for nfs so change the killtcp script to kill both directions of an NFS
connection.
We used to deliberately only kill one direction in these cases due to
limitations.
- Suggestion from Christian Ambach, when using natgw, try to avoid using an
UNHEALTHY node as the natgw master.
- From Michael Adam: Fix a SEGV bug in the recent change to the eventscripts
to allow the timeout to apply to each individual script.
- fix a talloc bug in the vacuuming code that produced nasty valgrind
warnings.
- From Rusty: Set up ulimit to create core files for ctdb, and spawned
processes by default. This is useful for debugging and testing but can be
disabled by setting CTDB_SUPRESS_COREFILE=yes in the sysconfig file.
- Remove the wbinfo -t check from the startup check that winbindd is happy.
- Enhance the test for bond devices so we also check if the sysadmin has
disabled all slave devices using "ifdown".

Tue Nov 3 13:00:00 2009 : Version 1.0.103
- Don't use vacuuming on persistent databases
- Michael A : transaction updates to persistent databases
- Don't activate service automatically when installing the RPM. Leave this to the admin.
- Create a child process to send all log messages to, to prevent a hung/slow syslogd
from blocking the main daemon. In this case, discard log messages instead and let the child
process block.
- Michael A: updates to log messages
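
The logging change in this release (hand messages to a child process and
discard them rather than block the main daemon) can be sketched with a
non-blocking pipe. This is an illustration under assumptions, not ctdb's
implementation: make_log_pipe and log_nonblocking are hypothetical helper
names, and the sketch assumes each message is shorter than PIPE_BUF so that a
write either succeeds completely or fails.

```python
import os


def make_log_pipe():
    """Create a pipe whose write end never blocks the caller."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    return read_fd, write_fd


def log_nonblocking(write_fd, message):
    """Try to hand one log line to the reader; drop it if the pipe is full.

    Returns True if the line was queued and False if it was discarded.
    Assumes the encoded line is shorter than PIPE_BUF (atomic pipe write).
    """
    data = (message + "\n").encode()
    try:
        os.write(write_fd, data)
        return True
    except BlockingIOError:
        # The reader (the logging child) is slow or hung: discard the line.
        return False
```

In this arrangement only the reader of the pipe ever waits on a slow syslogd;
the writer keeps running and loses log lines instead of stalling.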

Thu Oct 29 13:00:00 2009 : Version 1.0.102
- Wolfgang: fix for the vacuuming code
- Wolfgang: stronger tests for persistent database filename tests
- Improve the log message when we refuse to startup since wbinfo -t fails
to make it easier to spot in the log.
- Update the uptime command output and the man page to indicate that
"time since last ..." is from either the last recovery OR the last failover
- Michael A: transaction updates

Wed Oct 28 13:00:00 2009 : Version 1.0.101
- create a separate context for non-monitoring events so they don't interfere with the monitor event
- make sure to return status 0 in the callback when we abort an event

Wed Oct 28 13:00:00 2009 : Version 1.0.100
- Change eventscript handling to allow EventScriptTimeout for each individual script instead of for all scripts as a whole.
- Enhanced logging from the eventscripts, log the name and the duration for each script as it finishes.
- Add a check to use wbinfo -t for the startup event of samba
- TEMP: allow clients to attach to databases even when the node is in recovery mode
- don't run the monitor event as frequently after an event has failed
- DEBUG: in the eventloops, check the local time and warn if the time changes backward or rapidly forward
- From Metze, fix a bug where recovery master becoming unhealthy did not trigger an ip failover.
- Disable the multipath script by default
- Automatically re-activate the reclock checking if the reclock file is specified at runtime. Update manpage to reflect this.
- Add a mechanism where samba can register a SRVID and if samba unexpectedly disconnects, a message will be broadcast to all other samba daemons.
- Log the pstree on hung scripts to a file in /tmp instead of /var/log/messages
- change ban count before unhealthy/banned to 10
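
The per-script EventScriptTimeout and the name/duration logging introduced in
this release can be sketched as follows. This is a hedged illustration, not
ctdb's eventscript runner: run_event_scripts is a hypothetical function, the
"TIMEDOUT" marker is invented for the example, and the sketch simply runs
every script in the list without modelling how ctdb reacts to a failure.

```python
import subprocess
import time


def run_event_scripts(commands, timeout_per_script):
    """Run each command with its own timeout and record name, status, duration.

    commands is a list of argv lists. The timeout applies to every script
    individually rather than to the run as a whole.
    """
    results = []
    for argv in commands:
        start = time.monotonic()
        try:
            status = subprocess.run(argv, timeout=timeout_per_script).returncode
        except subprocess.TimeoutExpired:
            # subprocess.run() has already killed the script at this point.
            status = "TIMEDOUT"
        duration = time.monotonic() - start
        results.append((argv, status, duration))
    return results
```

With a whole-run timeout, one slow script early in the list would eat the
budget of every script after it; timing each script separately also shows
which script was the slow one.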

Thu Oct 22 14:00:00 2009 : Version 1.0.99
- Fix a SEGV in the new db priority code.
- From Wolfgang : eliminate a ctdb_fatal() if there is a dmaster violation detected.
- During testing we often add/delete eventscripts at runtime. This could cause an eventscript to fail and mark the node unhealthy if an eventscript was deleted while we were listing the names. Handle the errorcode and make sure the node does not become unhealthy in this case.
- Lower the debuglevel for the messages when ctdb creates a file descriptor so we don't spam the logs with these messages.
- Don't have the RPM automatically restart ctdb
- Volker : add a missing transaction_cancel() in the handling of persistent databases
- Treat interfaces with the name ethX* as bond devices in 10.interfaces so we do the correct test for whether they are up or not.

Tue Oct 20 14:00:00 2009 : Version 1.0.98
- Fix for the vacuuming database from Wolfgang M
- Create a directory where the test framework can put temporary overrides
to variables and functions.
- Wait a lot longer before shutting down the node when the reclock file
is incorrectly configured, and log where it is configured.
- Try to avoid running the "monitor" event when databases are frozen.
- Add logging for every time we create a filedescriptor so we can trap
fd leaks.

Wed Oct 14 14:00:00 2009 : Version 1.0.97
- From martins : update onnode.
Update onnode to allow specifying an alternative nodes file from
the command line and also to be able to specify hostnames on the
list of targets :
onnode host1,host2,...

Tue Oct 13 14:00:00 2009 : Version 1.0.96
- Add more debugging output when eventscripts have trouble. Print a
"pstree -p" to the log when scripts have hung.
- Update the initscript, only print the "No reclock file used" warning
when we do "service ctdb start", don't also print it for all other
actions.
- When changing between unhealthy/healthy state, push a request to the
recovery master to perform an ip reallocation instead of waiting for the
recovery master to pull and check the state change.
- Fix a bug in the new db-priority handling where a pre-.95 recovery master
could no longer lock the databases on a post-.95 daemon.
- Always create the nfs state directories during the "monitor" event.
This makes it easier to configure and enable nfs at runtime.
- From Volker, forward-port a simpler deadlock avoiding patch from the 1.0.82
branch. This is a simpler version of the "db priority lock order" patch
that went into 1.0.95, and will be kept for a few versions until samba
has been updated to use the functionality from 1.0.95.

Mon Oct 12 14:00:00 2009 : Version 1.0.95
- Add database priorities. Allow samba to set the priority of databases
and lock the databases in priority order during recovery
to avoid a deadlock when samba locks one database then blocks indefinitely
while waiting for the second database to become locked.
- Be aggressive and ban nodes where the recovery transaction start call
fails.
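
The deadlock that database priorities avoid is the classic lock-ordering
problem: if every party takes the locks in the same agreed order, two parties
can never each hold one lock while waiting for the other. A minimal sketch,
assuming that a lower priority number is locked first and using hypothetical
names (Database, lock_order, freeze_all) that are not ctdb's API:

```python
import threading
from contextlib import ExitStack


class Database:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority       # assumption: lower number locks first
        self.lock = threading.Lock()


def lock_order(databases):
    """Return the databases in the single agreed locking order."""
    return sorted(databases, key=lambda db: (db.priority, db.name))


def freeze_all(databases):
    """Lock every database in priority order.

    Returns an ExitStack; leaving its 'with' block releases the locks in
    reverse order.
    """
    stack = ExitStack()
    try:
        for db in lock_order(databases):
            stack.enter_context(db.lock)
    except BaseException:
        stack.close()                  # release whatever was already taken
        raise
    return stack
```

Usage is "with freeze_all(dbs): ..."; the name is only a tie-breaker so that
databases sharing a priority are still locked in a deterministic order.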

Sat Oct 10 14:00:00 2009 : Version 1.0.94
- Be very aggressive and quickly ban nodes that can not freeze their databases

Thu Oct 8 14:00:00 2009 : Version 1.0.93
- When adding an ip, make sure to update this assignment on all nodes
so it won't show up as -1 on other nodes.
- When adding an ip and immediately deleting it, it was possible that
the daemon would crash accessing already freed memory.
Readjust the memory hierarchy so the destructors are called in the right order.
- Add a handshake to the recovery daemon to eliminate some rare cases where
addip/delip might cause a recovery to occur.
- updated onnode documentation from Martin S
- Updates to the natgw eventscript to allow disabling natgw at runtime

Fri Oct 2 14:00:00 2009 : Version 1.0.92
- Test updates and merge from martin
- Add notification for "startup"
- Add documentation for notification
- from martin, a fix for restarting vsftpd in the eventscript

Tue Sep 29 14:00:00 2009 : Version 1.0.91
- New vacuum and repack design from Wolfgang Mueller.
- Add a new eventscript 01.reclock that will first mark a node unhealthy and later ban the node if the reclock file can not be accessed.
- Add machinereadable output to the ctdb getreclock command
- merge transaction updates from Michael Adam
- In the new banning code, reset the culprit count to 0 for all nodes that could successfully complete a full recovery.
- don't mark the recovery master as a ban culprit because a node in the cluster needs a recovery. This happens naturally when using the ctdb recover command, so don't make this cause a node to be banned.

Sat Sep 12 14:00:00 2009 : Version 1.0.90
- Be more forgiving for eventscripts that hang during startup
- Fix for a banning bug in the new banning logic

Thu Sep 3 14:00:00 2009 : Version 1.0.89
- Make it possible to manage winbind independently of samba.
- Add new prototype banning code
- Overwrite the vsftpd state file instead of appending. This eliminates
annoying errors in the log.
- Redirect some iptables commands to dev null
- From Michael A, explicitly set the broadcast when we take over a public ip
- Remove a reclock file check we no longer need
- Skip any persistent database files ending in .bak

Mon Aug 17 14:00:00 2009 : Version 1.0.88
- Add a new state for eventscripts : DISABLED.
Add two new commands "ctdb enablescript/disablescript" to enable/disable
eventscripts at runtime.
- Bugfixes for TDB from rusty.
- Merge/Port changes from upstream TDB library by rusty.
- Additional new tests from MartinS. Tests for stop/continue.
- Initial patch to rework vacuuming/repacking process from Wolfgang Mueller.
- Updates from Michael Adam for persistent writes.
- Updates from MartinS to handle the new STOPPED bit in the test framework.
- Make it possible to enable/disable the RECMASTER and LMASTER roles
at runtime. Add two new commands
"ctdb setlmasterrole/setrecmasterrole on/off"
- Make it possible to enable/disable the natgw feature at runtime. Add
the command "ctdb setnatgwstate on/off"

Fri Jul 17 14:00:00 2009 : Version 1.0.87
- Add a new event "stopped" that is called when a node is stopped.
- Documentation of the STOPPED flag and the stop/continue commands
- Make it possible to start a node in STOPPED mode.
- Add a new node flag : STOPPED and commands "ctdb stop" "ctdb continue"
These commands are similar to "disable/enable" but will also remove the node from the vnnmap, while disable only fails all ip addresses over.
- tests for NFS , CIFS by martins
- major updates to the init script by martins
- Send gratuitous arps with a 1.1 second stride instead of a 1 second stride to workaround interesting "features" of common linux stacks.
- Various test enhancements from martins:
  - additional other tests
  - add tests for grat arp generation, ping during failover, ssh and failover
  - New/updated tcp tickle tests and support functions
  - provide better debugging when a test fails
  - make ctdbd restarts more reliable in the tests
  - update the "wait bar" to make the wait progress in tests more obvious
  - various cleanups
- when dispatching a message to a handler, make the message a real talloc object so that we can reparent the object in the talloc hierarchy.
- document the ipreallocate command
- Updates to enable/disable to use the ipreallocate command to block until the following ipreallocation has completed.
- Update the main daemon and the tools to allow debug level to be a string instead of an integer.
- Update the sysconfig file to show using string literals instead of numeric values for the debuglevels used.
- If no debuglevel is specified, make "ctdb setdebug" show the available options.
- When trying to allocate network packets, add explicit checks if the network transport has been shutdown before trying and failing, to make log messages easier to read. Add this extra check and logging to every place packets are allocated.

Tue Jun 30 14:00:00 2009 : Version 1.0.86
- Do not access the reclock at all if VerifyRecoveryLock is zero, not even try to probe it.
- Allow setting the reclock file as "", which means that no reclock file at all should be used.
- Document that a reclock file is no longer required, but that it is dangerous.
- Add a control that can be used to set/clear/change the reclock file in the daemon during runtime.
- Update the recovery daemon to poll whether a reclock file should be used and if so which file at runtime in each monitoring cycle.
- Automatically disable VerifyRecoveryLock every time a user changes the location of the reclock file.
- do not allow the VerifyRecoveryLock to be set using ctdb setvar if there is no recovery lock file specified.
- Add two commands "ctdb getreclock" and "ctdb setreclock" to modify the reclock file.

Tue Jun 23 14:00:00 2009 : Version 1.0.85
- From William Jojo : Don't use getopt on AIX
- Make it possible to use "ctdb listnodes" also when the daemon is not running
- Provide machinereadable output to "ctdb listnodes"
- Don't list DELETED nodes in the ctdb listnodes output
- Try to avoid causing a recovery for the average case when adding/deleting/moving an ip
- When banning a node, drop the IPs on that node only and not all nodes.
- Add tests for NFS and CIFS tickles
- Rename 99.routing to 11.routing so it executes before NFS and LVS scripts
- Increase the default timeout before we deem an unresponsive recovery daemon hung and shutdown
- Reduce the reclock timeout to 5 seconds
- Spawn a child process in the recovery daemon to check the reclock file to
avoid blocking the process if the underlying filesystem is unresponsive
- fix for a file descriptor leak when a child process times out
- Don't log errors if waitpid() returns -1
- Onnode updates by Martins
- Test and initscript cleanups from Martin S
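
The reclock check moved into a child process in this release follows a common
pattern: do the possibly hanging filesystem access in a child and give up
after a timeout, so the parent never blocks. A minimal sketch under
assumptions: reclock_accessible and its probe parameter are hypothetical, the
default probe only stats the file (the real check may do more), and only the
5 second default timeout is taken from the entries above.

```python
import subprocess
import sys

# Hypothetical probe: the child merely stats the file named in argv[1].
DEFAULT_PROBE = "import os, sys; os.stat(sys.argv[1])"


def reclock_accessible(path, timeout=5.0, probe=DEFAULT_PROBE):
    """Check a file from a child process so the caller cannot hang on it."""
    child = subprocess.Popen(
        [sys.executable, "-c", probe, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Do not wait for the child here: it may be stuck inside the
        # unresponsive filesystem, which is exactly what we must not block on.
        child.kill()
        return False
```

The parent deliberately returns right after kill() without reaping the child,
trading a short-lived leftover process for a guaranteed upper bound on how
long the check can take.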

Tue Jun 2 14:00:00 2009 : Version 1.0.84
- Fix a bug in onnode that could not handle dead nodes

Tue Jun 2 14:00:00 2009 : Version 1.0.83
- Document how to remove a node from a running cluster.
- Hide all deleted nodes from ctdb output.
- Lower the loglevel on some eventscript related items
- Don't queue packets to deleted nodes
- When building initial vnnmap, ignore any nonexisting nodes
- Add a new nodestate : DELETED that is used when deleting a node from an
existing cluster.
- don't remove the ctdb socket when shutting down. This prevents a race in the
initscripts when restarting ctdb quickly after stopping it.
- TDB nesting reworked.
- Remove obsolete ipmux
- From Flavio Carmo Junior: Add eventscript and documentation for ClamAV antivirus engine
- From Sumit Bose: fix the regex in the test to handle the new ctdb
statistics output that was recently added.
- change the socket type we use for gratuitous arps from the obsolete
AF_INET/SOCK_PACKET to instead use PF_PACKET/SOCK_RAW.
- Check return codes for some functions, from Sumit Bose, based on codereview by Jim Meyering.
- Sumit Bose: Remove structure member node_list_file that is no longer used.
- Sumit Bose: fix configure warning for netfilter.h
- Updates to the webpages by Volker.
- Remove error messages about missing /var/log/log.ctdb file from ctdb_diagnostics.sh from Christian Ambach
- Additional error logs if the eventscript switching from daemon to client mode fails.
- track how long it takes for ctdbd and the recovery daemon to perform the rec-lock fcntl() lock attempt and show this in the ctdb statistics output.

Thu May 14 14:00:00 2009 : Version 1.0.82
- Update the "ctdb lvsmaster" command to return -1 on error.
- Add a -Y flag to "ctdb lvsmaster"
- RHEL5 apache leaks semaphores when crashing. Add semaphore cleanup to the
41.httpd eventscript and try to restart apache when it has crashed.
- Fixes to some tests
- Add a -o option to "onnode" which will redirect all stdout to a file for
each of the nodes.
- Add a natgw and a lvs node specifier to onnode so that we can use
"onnode natgw ..."
- Assign the natgw address to lo instead of the private network so it can also
be used where private and public networks are the same.
- Add GPL boilerplates to two missing scripts.
- Change the natgw prefix NATGW_ to CTDB_NATGW_

Fri May 8 14:00:00 2009 : Version 1.0.81
- use smbstatus -np instead of smbstatus -n in the 50.samba eventscript
since this avoids performing an expensive traverse on the locking and brlock
databases.
- make ctdb automatically terminate all traverse child processes clusterwide
associated to a client application that terminates before the traversal is
completed.
- From Sumit Bose : fixes to AC_INIT handling.
- From Michael Adam, add Tridge's "ping_pong" tool to the ctdb distro since
this is very useful for testing the backend filesystem.
- From Sumit Bose, add support for additional 64 bit platforms.
- Add a link from the webpage to Michael Adam's SambaXP paper on CTDB.

Fri May 1 14:00:00 2009 : Version 1.0.80
- change init shutdown level to 01 for ctdb so it stops before any of the other services
- if we can not pull a database from a remote node during recovery, mark that node as a culprit so it becomes banned
- increase the loglevel when we volunteer to drop all ip addresses after being in recovery mode for too long. Make this timeout tunable with "RecoveryDropAllIPs" and have it default to 60 seconds
- Add a new flag TDB_NO_NESTING to the tdb layer to prevent nested transactions which ctdb does not use and does not expect. Have ctdb set this flag to prevent nested transactions from occurring.
- don't unconditionally kill off ctdb and restart it on "service ctdb start". Fail "service ctdb start" with an error if ctdb is already running.
- Add a new tunable "VerifyRecoveryLock" that can be set to 0 to prevent the main ctdb daemon from verifying that the recovery master has locked the reclock file correctly before allowing it to set the recovery mode to active.
- fix a cosmetic bug with ctdb statistics where certain counters could become negative.

Wed Apr 8 14:00:00 2009 : Version 1.0.79
- From Mathieu Parent: add a ctdb pkgconfig file
- Fix bug 6250
- add a function remove_ip to safely remove an ip from an interface, taking care to workaround an issue with linux alias interfaces.
- Update the natgw eventscript to use the safe remove_ip() function
- fix a bug in the eventscript child process that would cause the socket to be removed.
- don't verify nodemap on banned nodes during cluster monitoring
- Update the dodgy SeqnumInterval to have ms resolution

Tue Mar 31 14:00:00 2009 : Version 1.0.78
- Add a notify mechanism so we can send snmptraps/email to external management systems when the node becomes unhealthy
- include 11.natgw eventscript in the install so that the NATGW feature works

Tue Mar 31 14:00:00 2009 : Version 1.0.77
- Update the 99.routing eventscript to also try to add the routes (back) during a releaseip event. Similar to the reasons why we must add addresses back during releaseip in 10.interfaces

Tue Mar 24 13:00:00 2009 : Version 1.0.76
- Add a debugging command "xpnn" which can print the pnn of the node even when ctdbd is not running.
- Redo the NATGW implementation to allow multiple disjoint NATGW groups in the same cluster.

Tue Mar 24 13:00:00 2009 : Version 1.0.75
- Various updates to LVS
- Fix a bug in the killtcp control where we did not set the port correctly
- add a new "ctdb scriptstatus" command that shows the status of the eventscripts.

Mon Mar 16 13:00:00 2009 : Version 1.0.74
- Fixes to AIX from C Cowan.
- Fixes to ctdb_diagnostics so we collect correct GPFS data
- Fixes to the net conf list command in ctdb_diagnostics
- Check the static-routes file IFF it exists in ctdb_diagnostics

Wed Mar 4 13:00:00 2009 : Version 1.0.73
- Add possibility to disable the check of shares for NFS and Samba
- From Sumit Bose, fix dependencies so make -j works

