Ticket #1577 (closed defect: invalid)

Opened 4 years ago

Last modified 4 years ago

smart+question

Reported by: nk Owned by: asterix
Priority: normal Milestone: 0.10
Component: None Version: hg
Severity: normal Keywords:
Cc: Blocked By:
OS: Blocking:

Description

especially on virgin box this makes Gajim take 5 minutes on big roster with epople with big avatars to start becoming usuable

Attachments

gajim_fast.patch (36.4 KB) - added by dkirov 4 years ago.

Change History

Changed 4 years ago by deckrider

A minor suggestion on this: don't ask for avatar if that contact is offline.

Changed 4 years ago by dkirov

In fact vcards are queried progressively, but roster is drawn at once. During this draw each avatar is read from filesystem and resized to fit the roster size. Maybe if we read vcards/avatars asynchronously it will help a little, but this will not be possible on win32 because file descriptors are not selectable. What we can do is draw roster items progressively. I found a similar problem with service browser: conference.jabber.org has more than 2500 rooms. If we try to browse this service gajim freeze forever. A good solution to prevent this problem will be to draw the items progressively, something like:

drawrows(rows_list):
    .....
    first_ten = rows_list[:10]
    remaining = rows_list[10:]
    draw(first_ten) # here comes the real drawing 
    timeout_add(10, drawrows, remaining)
    ....

There are some concurrency issues, which should be considered. Also see #607

Changed 4 years ago by dkirov

One more thing that can help, I read it somewhere in pygtk FAQ:

tree.freeze_child_notify()
#.... add tree items here
tree.thaw_child_notify()

This causes the draw operation to be performed only once, after the tree model is filled.

Changed 4 years ago by nk

deckrider's proposal is KISS and I like it.

dimitur, yes we must do this freeze for sure and in roster too!

for general trick and after we have the first 2 and still not visual solution, to do hard work but also avoid UI freezing we use generators and call the genrator in a gobject.idle_add. it's in PyGTK FAQ and you can find this trick in Gajim as I do it in history_window.py to draw bold on calendar progressively.

Changed 4 years ago by dkirov

gobject.idle_add and gobject.timeout_add with a small timeout and calback returning true have the same effect. I would prefer using timeout_add, because it gives you the flexibility to insert on some of the following lines another idle function, which can be called before the first one (if first timeout is 5 msec. and the second 4 msec.) and without messing up priorities.

My greatest concern is about concurrency. We clean and draw the whole tree each time some of our accounts connects or disconnects to server.

Here is sample situation, that will lead to wrong list in roster window:

Lets say we have accounts A and B and their contacts are called A1, A2.. A100 and B1,B2...B200

We try to connect both accounts. First Account A get to state online and receives from server his contacts, which are called A1, A2.... A100. We draw some of them, lets say A1...A10 and we set timeout/idle_add for the others. Next, account B comes to state online and this causes the roster to be cleaned, at the same loop account B starts to draw roster items again and this includes his contacts and the contacts of account A. The timeout for drawing A11..A100 is still in glib idle queue, which means that we'll draw some items twice, or we'll die with TB, because root element is missing.

There are many ways to get out of this problem, source_remove on the pending events is one of them. However I would prefer if we don't redraw the whole roster each time one of our accounts change his status! LetsiImagine? I have one account with 200 contacts and another with 10. If I change the status of the second account, gajim will clean and draw 210 lines, which is not necessary.

I saw that big avatars make Gajim unusable not only in roster, but in contact info dialog. This should be investigated more precisely.

Changed 4 years ago by nk

idle_add with generator yielding true is the best way as it's executed when there are no gtk events (UI update etc)

and we should use gtkmodelfilter. fill the model *once* then use filter to what to show (!offline and when show offline show all)

big avatars is a PITA. if user is 56k and has to get 500 kb avatar we can't do much really!

Changed 4 years ago by dkirov

I agree about gtkmodelfilter, and if we don't use make_menus roster drawing will be much faster. timeout_add is executed when there are no gtk events as well :) In fact both idle_add and timeout_add have by default PRIORITY_DEFAULT_IDLE, which is greater than priority of gtk events so callbacks are executed after there are no gtk pending events.

---

Looking at simplexml.py code I think I found what can cause this leak. It is hard to explain though.

So far, lots of refactoring have to be done!

Changed 4 years ago by nk

modelfilter is #1201

I dont understand why make_menus is a problem. because of X11 async dying? and how can we overcome this??

Changed 4 years ago by dkirov

First, it is a problem, because it is slow: 0.2 - 0.5 sec. and during this time gajim is frozen. Second, it is called in occasions where there is no need to be called and sometimes more than once in a single action: see #607

Let's say it take approximately 0.3 sec. for make_menu to complete a single call. When I go to state online with one of my accounts Gajim will be frozen for 0.9 sec. just because of making the menus. If I connect three accounts at one and the same time the freeze time will be 2.7 sec.

We can create these menus on the fly: when I click "Actions" the menu items for service discovery are added and so on.

At least this will prevent X11 async dying.

Changed 4 years ago by asterix

make_menu is now called only when really needed. we now don't c,ear and refill the whole roster when we connect to a second account

Changed 4 years ago by asterix

if we show offline contacts, we need to ask their vcards

Changed 4 years ago by nk

if user has show offline contacts, we don't care, we just ask vcards for the currently online. this progressively (if they ever come online) fills them all with vcards/avatars. also if user opens to send a msg to an offline that we don't have avatar, we could ask.

I just drop ideas here..

also I'm not sure about the model freeze thingy as it's not like we add so many rows. moreover from the history manager I did I noticed they do a pretty good job if you add hunders of rows in model and they do not block the UI

Changed 4 years ago by asterix

why not just ask them progressively ? (let's say max 1 per seconds) this way we won't freeze and remember it's done only once

Changed 4 years ago by nk

ok you can start with that. we'll have to see what suits us better try rm you vcards and avatars and you'll know for sure when the solution we have is the best :) (I trust you have many roster items :P)

Changed 4 years ago by dkirov

I think there is no need of such a big timeout. If we put the drawing function in gtk idle queue of events this function will be called when there are no other gui events, this can be after 1 sec. or after 1msec, depending on how busy is gtk.

Instead of

draw_contact(...)
draw_avatar(...)
draw_contact(...)
gobject.idle_add(draw_contact, ...)
# or
# gobject.timeout_add(10, draw_avatar, ...)

However whatever we do, loading 1mb jpeg will always make Gajim freeze for a while.

Another idea: if composing common.xmpp.Node(...) is still slow, I can make it work asynchronously.

Changed 4 years ago by nk

Node() is fast now :)

about idle_add we also need to yield True when more work to do. this is because GTK can say: "hey ok go, do this." but 1 sec later may say: "hey I want CPU" and we say: "no way. wait 10 secs"

Changed 4 years ago by dkirov

I don't think I get it.

Drawing an image is a single operation. It cannot be devided into separate subtasks, (if it can, this will be perfect). Because it is a single operation, we don't need to execute it twice, that's why we should return False.

There is no difference when you draw an image now, or after 10 sec. It is gtk's job to decide when to do the pending events, if we try to take its place, we'll probably fail.

Changed 4 years ago by nk

if we have 350 kb avatar, just image parsing the hash of it.

you also said that expat has a limitation. so we may also fail, so we must be sure we catch and skip so bad and big avatars (afterall only them can produce 400 kb xml)

Changed 4 years ago by nk

image => imagine*

Changed 4 years ago by dkirov

Well, if we have 350 kb avatar and we try to load it in the first possible moment from now, it will be loaded for almost the same time as if we try to load it in the first possible moment after 10 sec. Talking about CPU with GTK is far beyound our capabilities. We just say "load this image" when you can and gtk loads it for foo seconds, or we say "this image is too large, better not load it"

about expat: I meant that I cannot beleive expact to have such limitaion and it was true. The problem was not in expact, but in xmpppy implementation of it - class NodeParser?.

I have a roster of about 100 contacts and I don't load avatars and Gajim works as fast as psi, so we should focus on this. Is there a way to load big image progressively?

see ./src/gtkgui_helpers.py lines 311 - 318

we can send the image data on small chunks, and have a callback on_load. What do you think about this idea?

Changed 4 years ago by dkirov

Here is a testcase that loads a 300+ K image in 3 different ways. The first - directly and we freeze for a moment, the second one is after 1 sec. and the third is asynchronously. This testcase is only for loading single image, I'll make another one for loading set of images.

Changed 4 years ago by dkirov

trac doesn't allow upload of large files, here it is:  http://jabber.homeunix.net/testcase_loading_images.tar.gz

Changed 4 years ago by dkirov

Test case for loading 20 images, first test is the way we do it now and second test is asynchronously.  http://jabber.homeunix.net/testcase_loading_20.tar.gz

Changed 4 years ago by asterix

ok asynch mode is better for sure ! Let's go with this one

Changed 4 years ago by dkirov

I found one more reason for the freeze - scaling of images. Again async load can help. We can tell to the loader what is the prefered size, after enough exif data has been read. In the new testcases the first example is more close to what gajim do and the freeze is for more long time. I put a single call to sha of the real image in all tests just as a proof that freeze has nothing to do with computing shas.

 http://jabber.homeunix.net/testcase_load_resize.tar.gz

 http://jabber.homeunix.net/testcase_load_resize_20.tar.gz

@nkour, @asterix: I really didn't understand the idea of 1 sec. delay or cpu negotiation, so if you can make testcases about this will be great and then we can choose the best combination of all ideas for dealing with avatars.

Changed 4 years ago by nk

if async helps even in scaling, start with async as Yann said :)

Changed 4 years ago by dkirov

If async help..- who knows? does it really?

Write who? I removed all code that have to do with avatars on my local copy and Gajim works as fast as psi and I'm happy with it, because I like my IM client to be functional and not decorative. If you want I can commit this, but I guess you probably don't. The testcases were a proof that certain ideas are not illusions and to make a comparison between all real ideas. If we test all possibilities, maybe we will prevent rewriting the whole avatar code for third time. I wouldn't have written these testcases if I knew they'll be the only uploaded ones.

Changed 4 years ago by asterix

with your latest testcase, on a slow machine:

with load now, UI is frowen during about 15 seconds.

with load async, UI is never frozen, and latest image is displayed after about 6 seconds.

I think it's clear enough :D

Changed 4 years ago by dkirov

Well, it is clear to you and me, but I didn't write this examples for my personal satisfaction and it seems that nk has some doubts.

nk, whay is that "if" condition?

As far as I can see you were the one that wrote avatars freezing code. Do you expect that each time you write some code which is freezing the gui it will be someone else due to rewrite it??? If we work this way Gajim will always stay in the experimental gap. Just imagine this scheme:

A writes some code -> 
  the code is freezing the gui -> 
     A tells B to rewrite this code, while A is writing some other code.

And what about all your comments about yeilding True and cpu negotiation. Do they have some ground, or they are just like: "Listen to me, I'll tell you". If you are too busy writing your cpu negotiation thing I think Jim++ can help you in this, for he proclaimed his experience in cpu timing #1604.

If you've just lost interest in this ticket then it is better to say it clear text and not hiding with "if" conditions.

btw. current avatar code is implemented with a modern KISS technology, while my async examples are dirty, complex and unmaintainable, so it is better to keep the code as it is now, because it is the way Gajim should work.

Changed 4 years ago by nk

what can I say? the if was not doubt, it was condition. I could have said:

provided your asyncs makes even avatar scaling faster, that's the way to go.

moreover, I already said that I've used yield True in action and I said where. I just finished exams and I don't think how attacking me helps really

anyways, so the #1 problem is scaling? I mean we solved the parsing 400 kb? and it's gtk scaling from 1000x1000 to 48x48 that is slow? I can just test how much time it takes to make maro huge avatar scale down.

heh, wtf can I say..

time python foo.py

real 0m1.241s user 0m1.040s sys 0m0.160s

in my CPU (1Ghz)

just run the attach and calm down. really!

Changed 4 years ago by dkirov

Now that is clear text, I didn't know that you still have exams to pass, because I saw some activity from you in commits and comments. So, take your time, don't rush.

--- When you finish with the examps: Try to comment the last two lines of your sample code and see the time. Next, try to find how many times image loading is done for each contact and in each occasion.

According to these timings I can say that a person with 20Ghz machine will be able to load a contact without a visible freeze.

Changed 4 years ago by nk

I just finished ;)

so, /me waiting 45 seconds on maro xml was the 1,5 sec of scaling down the avatar?

Changed 4 years ago by dkirov

Maybe you need a rest from the exam session. :)

So much for "how many times image loading is done". If you are unable or don't want to handle your own code, then please don't give orders someone else to do it. Jim++ may ask you "How could you ever think of storing avatars and vcards files?". If you want, you disagree with with him or find another way round.

This patch is the only thing I will do for solving something that is useless to me and that cannot be handled by its original authors. And with it Gajim is really as fast as psi.

Changed 4 years ago by dkirov

Changed 4 years ago by nk

  • status changed from new to closed
  • resolution set to invalid

I close this ticket as INVALID. from the time Dimitur fixed Node() thing, the problem is not there. I tested by rm vcards and avatars and re-loggin in the same servers that I used to have those problems.

thank you all for your comments and ideas.

Changed 4 years ago by nk

the problem is not there ==> the problem is not there anymore (IT IS FIXED)

Add/Change #1577 (smart+question)

Author


E-mail address and user name can be saved in the Preferences.


Change Properties
<Author field>
Action
as closed
The resolution will be deleted. Next status will be 'reopened'
Next status will be 'needinfo'
 
Note: See TracTickets for help on using tickets.