Ticket #1604 (reopened enhancement)

Opened 5 years ago

Last modified 11 months ago

Store chat logs in plain text or in sqlite

Reported by: anonymous
Owned by: nkour
Priority: low
Milestone: Patches Welcome
Component: history
Version: 0.9.1
Severity: normal
Keywords: plugin
Cc: mcepl@…
Blocked By:
OS: All
Blocking:

Description (last modified by johnny) (diff)

Gajim should store chat logs in plain text files instead of a database. Without the ability to store chatlogs as plain text files (as almost all other IM programs do it), Gajim is pretty much useless. I want to be able to grep over my chatlog as I always did, and I know lots of people who want to do the same. And I want to be able to store my chat logs and be able to look over them after years. A db dump does not help much there.

Attachments

Change History

Changed 5 years ago by Jim++

  1. You have a history manager (in svn) which is MUCH easier and better than grep.
  2. Love to search your logs from the command line? Just use sqlite instead of grep.  http://trac.gajim.org/wiki/LogsDatabase
  3. AFAIK, database storage is faster and more scalable.

Changed 5 years ago by nicfit

Jim++, well said.

Changed 5 years ago by anonymous

With grep I can search through all my history files at once, no matter which messenger they come from. If I had to use sqlite for Gajim, I would have to search at least twice. That's not really what I want.

And what about storing the logs for later use? As I said before, I can't be sure that I will be able to read the sqlite database file in the future (and by future I mean years from now), which I need to be able to do if I want to keep my logs.

For storing log files, which only one person accesses at a time, there is no need for speed or the other advantages a database might offer. I doubt that storing in a database is even faster than writing to a text file. The same is true for scalability: there is no problem even if a log file grows to several megabytes.

I really don't see why one would want to store the history in a database; there does not seem to be even one real advantage over plain text files.

Changed 5 years ago by anonymous

Faster? I have to wait over 30 seconds, while gajim is loading history of my favorite muc. That sux.

Changed 5 years ago by nk

  • status changed from new to closed
  • resolution set to wontfix

the major advantage is searching the logs. another major one is that when I did this, I rewrote the whole API, which was very cryptic, minimalistic, and problematic to maintain.

About searching twice, just write a script if you really need to do it all at once.

most users use one IM client most of the time, most users like happy UIs and not unfriendly grep, most users should start using FLOSS and not always stick to Windows apps and Windows UIness.

last but not least, embeddable sql makes things easier for the developer (once the basic api and table design are done). And keeping the dev happy in a floss project where he's unpaid is #1, and then comes everything else you demand, as 'demand' cannot easily exist without money.

have fun :)

ps. the way to search in the future is via Beagle and the like, and someone is doing (or was doing) a plugin for Gajim.

pps2. if you really wait 30 seconds, maybe move your db outside of NFS

Changed 5 years ago by asterix

  • status changed from closed to reopened
  • severity changed from major to normal
  • resolution wontfix deleted
  • summary changed from Gajim should store chat logs in plain text files instead of a database to Store chat logs in plain text or in sqlite
  • version set to svn
  • type changed from defect to enhancement

what can be done is to write 2 interfaces: one for db, one for plain text logs, and let the user choose (via ACE of course). The default will stay db. so just send us a patch that provides both ways (or I'll do it, but not in the near future)

Changed 5 years ago by nk

  • priority changed from normal to lowest
  • version changed from svn to 0.9.1

wow, this seems really strange, Yann :)

it's not like we don't have other tickets to do. anyways, if you won't do it in the near future, and if you still insist on reopening, at least downgrade the priority.. anyways, I'll do it

Changed 5 years ago by dkirov

  • priority changed from lowest to normal

Yann I think it will be better if we keep one backend and have an ACE option keep_logs_also_as_text.

When it is activated, the logs will be kept both in the db and in plain text files. We'll do nothing with these plain text files except write to them. This is not difficult to implement and helps a lot: the user can search across all contacts or with regular expressions, limit the results, do backups, etc.

This way the user (if he activates this option) will have both the history manager, which Gajim devs think is good for him, and plain text files, which he thinks are good for him, and all will be happy.
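A minimal sketch of the option described above, assuming a hypothetical `TeeLogger` class, a made-up three-column `logs` table, and one text file per contact. None of these names come from Gajim's code; only the option name `keep_logs_also_as_text` is taken from the comment. The database remains the only store the client reads from, and the text file is append-only.

```python
import os
import sqlite3
import tempfile
import time


class TeeLogger:
    """Writes each message to SQLite and, optionally, to a per-contact text file.

    Hypothetical sketch: the table layout, file naming, and line format
    are assumptions, not Gajim's actual logger.
    """

    def __init__(self, db_path, text_dir, keep_logs_also_as_text=False):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS logs (jid TEXT, time INTEGER, message TEXT)"
        )
        self.text_dir = text_dir
        self.keep_logs_also_as_text = keep_logs_also_as_text

    def write(self, jid, message, timestamp=None):
        timestamp = int(time.time()) if timestamp is None else timestamp
        # The database stays the primary store; the context manager commits.
        with self.conn:
            self.conn.execute(
                "INSERT INTO logs (jid, time, message) VALUES (?, ?, ?)",
                (jid, timestamp, message),
            )
        # The text copy is only ever appended to, never read back.
        if self.keep_logs_also_as_text:
            os.makedirs(self.text_dir, exist_ok=True)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))
            path = os.path.join(self.text_dir, jid.replace("/", "_") + ".log")
            with open(path, "a", encoding="utf-8") as handle:
                handle.write(f"[{stamp}] {message}\n")


# Usage: with the option on, the message lands in both places.
log_dir = tempfile.mkdtemp()
logger = TeeLogger(":memory:", log_dir, keep_logs_also_as_text=True)
logger.write("alice@example.org", "hello", timestamp=0)
```

With the option left at its default of off, `write` touches only the database, so existing behaviour would be unchanged.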

Changed 5 years ago by anonymous

  • version 0.9.1 deleted
  • type changed from enhancement to defect
  • severity changed from normal to major

Come on nk, I did not demand anything; I told you what I think. If you like to ignore your users, do it, if you think that is the right way to deal with them. I really liked Gajim before this change to sqlite, but without text file logs it is useless for me and, believe it or not, there are lots of users out there who use multiple clients because they need other IM networks besides jabber and don't like these multi-messengers. On the other hand, text files do not mean that users have to use tools like grep to search. You can still implement a search feature inside Gajim. So that isn't really an argument against it.

I think I could live with the solution dkirov suggests.

BTW, on Gentoo Gajim is still available as 0.8.2 only because newer versions either do not run at all or are very unstable. Don't know why that is the case though.

Changed 5 years ago by anonymous

  • version set to 0.9.1
  • type changed from defect to enhancement
  • severity changed from major to normal

Sorry, didn't mean to change these settings.

Changed 5 years ago by nk

bug against gentoo on .8.2

I won't continue this discussion as there is really no point. I'll just say I won't patch this ticket for sure. Everyone is welcome to do whatever he likes, even with bad code or not KISS, and when this is too much for me I'll resign from the project and that's it :)

Reporter, your tag is: "I use many non-multi-im clients, I like grep"

my tag is: "grep is for nerds and Gajim shouldn't target nerds but everyday people who just want to chat and they happen to use jabber because it's better than others and because it is open standard and scales well. those people won't do grep nor regexp nor whatever"

fwiw (I really don't want to continue this) sqlite does regexp's if you like. so it's all about writing two freaking lines in bash shell. either one to dump the db to files and parse it, or to run sqlite and select whatever from whatever and parse it.
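A sketch of what a regular-expression search over the log database could look like. SQLite parses the `REGEXP` operator but leaves the function behind it to the application, so the example registers one with Python's `sqlite3.Connection.create_function`. The `logs` table layout and the `search_logs` helper are assumptions made for illustration, not Gajim's schema.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs SQLite's `X REGEXP Y` operator, which calls regexp(Y, X)."""
    return value is not None and re.search(pattern, value) is not None


def search_logs(conn, pattern):
    """Return (jid, message) rows whose message matches the regular expression."""
    conn.create_function("REGEXP", 2, regexp)
    query = "SELECT jid, message FROM logs WHERE message REGEXP ? ORDER BY rowid"
    return conn.execute(query, (pattern,)).fetchall()


# Demo on an in-memory database with a made-up table layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (jid TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [
        ("alice@example.org", "meeting at 10:30"),
        ("bob@example.org", "see you tomorrow"),
        ("alice@example.org", "moved to 11:00"),
    ],
)
matches = search_logs(conn, r"\d{2}:\d{2}")
```

Here `matches` holds the two messages that contain a time such as `10:30`; the message from bob is filtered out.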

I have to say I'm surprised that both Dimitur and Yann even talk about still using text files, but I'm not mad or anything. Just curious that no one here explained how that helps Gajim (having *TWO* ways to do the same thing). Since this is written word, I clearly state I'm not sad or mad or whatever and I'm really happy we have this conversation here :) [long live Gajim and all that :P]

Changed 5 years ago by nicfit

I figured I'd chime in on this. I'm in complete agreement with Nikos. A patch that does the separate file thing with a default of OFF would be fine, but I see no reason to expend any effort on it.

FWIW, I've been thinking about the Single Message project that I hope to tackle for 0.11, and the DB is a key player if we wish to provide an email like experience for message type=normal|headline.

There is another ticket floating around about allowing a different DB backend. Now THAT is interesting because it could allow storing all my logs in one place. Considering I use three different machines regularly, having logs split over all three machines is less than useful regardless of whether those logs are a flat file or a DB. But that is another subject entirely.

I propose PATCHYOURSELF and close ;)

Changed 5 years ago by Jim++

Store logs in both database and plain file ?! How could you even think of it ? It's just a waste of space and cpu time. And the gajim code will be more complex.

  • You can be sure that it will be possible to access your logs in many years. I mean, sqlite is FREE SOFTWARE, it's not a black box !
  • If you need a complex way of searching logs, the history manager is here for that. I think it can be much improved. I don't understand in which way you want to be able to search your logs that cannot be done by the history manager. Please tell us. And OH-MY-GOD WHO uses reg. exp. to search logs ?! It's just impossible that the few people who are capable of doing this aren't capable of doing it in sqlite. They must just be lazy.
  • If you want to be able to search multiple IM client histories, write, or wait for, the beagle plugin (  http://trac.gajim.org/ticket/647 ). Are you really meaning that you have a simple way to search in multiple IM logs in one time ?

Changed 5 years ago by anonymous

nk, forget about the grep thing. Even more important than that is storing the logs somewhere for later use. I want to be able to read the logs in several years from now, and with text files I'm sure there will be no problem doing that, but it is a completely different story with a sqlite database file.

Jim++ wrote:

Are you really meaning that you have a simple way to search in multiple IM logs in one time ?

Yes I do. I can simply use grep and search over all the log files in one run. Simple as that. By the way, having text files for the logs doesn't mean everyone has to use grep or something to search in the logs (so the nerd argument does not count), most IM clients have search functions and I am not aware of any other client except Gajim that uses a (sql) database for storing the logs.

Changed 5 years ago by dkirov

Jim++ wrote: Store logs in both database and plain file ?! How could you even think of it ? It's just a waste of space and cpu time.

Jim++, I really don't know how I could ever think of it. I'm just so glad that there are cpu experts who can tell which way is better. I hope you won't mind sharing your cpu optimization knowledge in #1577. It is about avatars, they are made with the KISS technology, so you probably won't have any problems going through the code. I don't use avatars at all, and it will be good if someone with cpu knowledge solves this problem. I can't see any example of this cpu thing that everybody talks about and I am starting to believe that it is a dream. Be the one who writes some sample code and makes it real.

About this ticket, forget what I said, I use logs very rarely and I really don't care how they'll be done. I just tried to prevent about 134 possible new bugs. Yann proposed to have two backends and I said that instead of two backends it is better to have text files only for storing logs, without the need to use them for history manager, which means that it will have the same result with less code. So, do what you want, I really don't care about logs.

Jim++ wrote: And the gajim code will be more complex.

Complex code? Try to go through the xmpppy library and see what Gajim relies on, or try to follow the logic in the >3000 lines of code in roster_window.py where the main application routines are implemented (incl. avatars). I'll definitely say that watching Chelsea - Barsa is a far more pleasant thing to do.

Changed 5 years ago by nk

really, I hope this is the last time I write in this ticket :)

as Jim++ said sqlite is used in many PHP5 projects and it was awarded as project of the year and all that. it's not something that is going to die, but let's say all die, then you can just dump the db.

I didn't do any test on which clients use a db, as I do not care, nor do I think Psi or whatever offers a high quality history and history manager. I suspect that Trillian uses a DB as they have the coolest history manager I've ever seen (they even do stats on when you talk the most during the day).

remember, you're not forced to use Gajim, just use what suits you best. Psi runs on GNOME/GTK envs quite well and you can enjoy its great history manager and logs.

Dimitur, I do not understand why you feel like attacking everyone you disagree with. The fact that xmpppy code is complex (and I might say badly documented) is true, but it's not our code, and even if it was we couldn't really say: "hey, we have bad code, let's do some more". I congratulate you for understanding and mastering it and doing the great nb stuff, and I have said that from day one. so now please stop attacking opinions so harshly and just watch Champions League every Tuesday and Wednesday every 15 days as I do and that's all :P

Chelsea sucked big time, and Real Madrid vs Arsenal (Henry is great) as well as Werder vs Juventus were real matches :P

Changed 5 years ago by dkirov

nk, I can never disagree with a new invention about the shape of the Earth.

It is highly inappropriate, while I'm dealing with optimization fixes of code not written by me, to receive advice which is more suitable for my grandma. Surprised by the fact that we have a QA in the team, I asked him to do his real job and take responsibility for the real problems that we face. And now it is obvious to me that it wasn't the QA that was talking, but only his speaker.

About the CL: yeah Barsa rock! and Arsenal never played that well in the Champions League and still many things could happen in the next round!

Changed 4 years ago by nicfit

Not sure why this ticket showed up in my RSS feed since the last "action" seemed to be 2/23/2006, but since it did and it is still open I'll add a few more "cents".

Anonymous, since you are so familiar with the Unix command line, why don't you just write a cron job that runs sqlite to dump the DB contents to a plain text file, formatted however you like? It should not be too hard, and the script *you* come up with can possibly be added to the Wiki for other users who are so opposed to a more triumphant logs backend.
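One possible shape for such an export script, written in Python rather than shell so that it is self-contained. It is a sketch under stated assumptions: the `logs` table with `jid`, `time`, and `message` columns is invented for the demo and would have to be adapted to the real database, and `dump_logs` is a hypothetical helper name. A cron job could run a script like this periodically against the log database file.

```python
import os
import sqlite3
import tempfile
import time


def dump_logs(conn, out_dir):
    """Write one text file per contact, one line per message, oldest first."""
    os.makedirs(out_dir, exist_ok=True)
    written = set()
    rows = conn.execute("SELECT jid, time, message FROM logs ORDER BY time, rowid")
    for jid, timestamp, message in rows:
        path = os.path.join(out_dir, jid.replace("/", "_") + ".txt")
        # Truncate a file the first time it is touched in this run, then append,
        # so repeated runs regenerate the export instead of duplicating lines.
        mode = "a" if path in written else "w"
        written.add(path)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))
        with open(path, mode, encoding="utf-8") as handle:
            handle.write(f"{stamp} {message}\n")
    return sorted(written)


# Demo on an in-memory database with an assumed table layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (jid TEXT, time INTEGER, message TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [
        ("alice@example.org", 0, "hi"),
        ("bob@example.org", 60, "hello"),
        ("alice@example.org", 120, "bye"),
    ],
)
out_dir = tempfile.mkdtemp()
paths = dump_logs(conn, out_dir)
```

The resulting files are plain UTF-8 text, so they can be searched with grep alongside logs from other clients.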

Changed 4 years ago by nk

  • status changed from reopened to closed

I extracted some info by Dimitur about backing up the db and sent it to Yann so it will be soon a WIKI PAGE about it.

Changed 4 years ago by ccarlin@…

  • status changed from closed to reopened

From the talk attached to this thread one would think that there are NO downsides to the database backend. This is clearly not the case.

Firstly, for all users the reliance on sqlite adds a dependency. Believe it or not, not everyone has sqlite and the python wrappers on their machines. It's just one more thing to require the user to obtain, keep track of, and possibly break. It certainly takes additional system resources to run a database backend.

Next, the tracker has other cases where people have run into problems caused by sqlite itself. For example, look at the laptop users who have their laptops spin up with every message sent or received. They're assured that this is required by the library, which makes it a feature of the library. See, for example, #2183

In my case I have three seconds of disc churn every time I send or receive a logged message.

So the database backend definitely has drawbacks which, in some cases, are severe. That in itself calls for the implementation of a plain text log, at least as an option. I'd suggest it as the default one since it's guaranteed to work for every user.

Finally, the worksforme resolution is clearly undeserved since even if the user created his cron script to export a text log, gajim would still not be storing chat logs in plain text. Therefore, no, the feature does not work for you.

Changed 4 years ago by nk

first of all, hd changes are cached and committed to hd after foo secs (kernel stuff) AFAIK. you can always disable logging if you want to save battery or whatever.

the fact that it's doable doesn't mean it should be done, as it adds complexity with no obvious gain to the everyday usage of Gajim.

the added dependency, if you can't really install yet another dep, and if you use a strange and obscure distro that doesn't ship packages for those deps we req, maybe use another client which suits your minimalistic needs.

Gajim doesn't aim to be minimalistic; it aims to be easy and fun for the users to use and for DEVS to dev on.

Changed 4 years ago by anonymous

Sqlite requests that changes be immediately flushed to the disk. There is no chance of caching them, and even the new laptop mode with aggressive caching obeys sqlite's call for a flush. Check the cited bug where gajim developers clarify that this is the correct behavior, that a sqlite backend properly spins up laptop harddrives for every message.
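For reference, the flush-on-commit behaviour described above is governed in SQLite by `PRAGMA synchronous`. The sketch below shows how an application could expose it as a setting; `open_log_db` and its `sync_on_commit` parameter are hypothetical names, and nothing here reflects what Gajim actually does. Turning syncing off is a trade-off rather than a fix: SQLite then hands data to the operating system without waiting for it to reach the disk, so a crash or power loss can lose or corrupt recent log entries.

```python
import os
import sqlite3
import tempfile


def open_log_db(path, sync_on_commit=True):
    """Open a database and choose whether commits wait for a disk sync.

    Hypothetical helper: FULL waits for the sync on every commit,
    OFF leaves the timing of the write to the operating system.
    """
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA synchronous = " + ("FULL" if sync_on_commit else "OFF"))
    return conn


def synchronous_mode(conn):
    """Return the current setting as SQLite reports it (0 = OFF, 2 = FULL)."""
    return conn.execute("PRAGMA synchronous").fetchone()[0]


# Demo: two databases in a temporary directory, one per setting.
db_dir = tempfile.mkdtemp()
safe = open_log_db(os.path.join(db_dir, "safe.db"))
fast = open_log_db(os.path.join(db_dir, "fast.db"), sync_on_commit=False)
```

Whether losing the last few messages after a crash is acceptable for chat logs is a judgment call that the ticket does not settle.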

These aren't just gains for everyday use: for many people, laptop users in particular, this stuff makes gajim unusable. For others they are clear gains for every day use. The sqlite access causes a significant delay upon every message logged.

Gajim is fun to use, etc. However, this sqlite requirement keeps all of that from many users.

Changed 4 years ago by asterix

  • priority changed from normal to low

I agree that, if possible, we leave the choice to the user of the way to save logs (DB or files) but this is not a priority at all for us. If you want to help us on that, just take the logger class, and write the functions that are in it using text files, we'll include it in our trunk.

Changed 4 years ago by ccarlin@…

I may actually be able to work on this at some point (python is my favorite language, after all). For now I simply didn't want to see this bug improperly closed or have the request spoken about as unreasonable or pointless.

Changed 3 years ago by mildred

I also want to say that I don't really like the sqlite backend, for various reasons:

  • It is not guaranteed that I can access the logs in the future, as the sqlite database format is far more complex than ASCII or UTF-8 (even if sqlite is free software).
  • I can't grep the logs (and that's useful, for example, to search inside logs from various programs at the same time, like xchat).
  • I can't use Gajim on two computers because the databases can't be easily merged, and I can't search for logs in two databases at the same time (this is a major drawback for me).

I just want to add this to that ticket because even if I don't often grep my logs (maybe once per year at most) it is still an issue for me.

Changed 3 years ago by mildred

See DatabaseExtract to have the script I made to export logs from the database

Changed 22 months ago by mcepl

  • cc mcepl@… added
  • os set to All

Changed 11 months ago by johnny

  • keywords plugin added
  • description modified (diff)
  • milestone set to Patches Welcome

support this as a plugin
