TinTin++ Mud Client The TinTin++ message board

 

#format %w

 
nya



Joined: 25 Jun 2012
Posts: 39

Posted: Sun Jun 29, 2014 11:22 pm    Post subject: #format %w

I'm not sure if this is a definite bug or intended behavior, but if you have something like:

Code:
#var {test} {one two three four\nfive six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen}
#alias {test} {
   #echo {-Pre-wrap-};
   #echo {$test};
   #format {test2} {%w} {$test};
   #echo {-Post-wrap-};
   #forall {$test2[]} {#echo {$test2[&0]}};
}


And you run the above test alias, the output (with 80 columns) is:
Quote:
-Pre-wrap-
one two three four
five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen
seventeen eighteen nineteen
-Post-wrap-
one two three four
five six seven eight nine ten eleven twelve thirteen fourteen
fifteen sixteen seventeen eighteen nineteen


In other words, it's counting line one ("one two three four") as part of line two's length ("five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen") when deciding where to split line two.

One solution seems to be to modify wrapstring() in variables.c, which contains this pair of if statements:
Code:
      if (*pti == ' ')
      {
         lis = pti;
      }

      if (col > ses->cols)


Adding the following OR'd conditions seems to resolve the issue. It looks like the current code is not only unaware that newlines count as whitespace (tabs seem to have the same problem, but that's probably beyond the scope of this post), it's also unaware that a newline moves the text to the next line.

The fix addresses both by treating newlines as whitespace and by wrapping on a newline.

Code:
      if (*pti == ' ' || *pti == '\n')
      {
         lis = pti;
      }

      if (col > ses->cols || *pti == '\n')


The version I did all this in is 2.01.0.

As an aside, with the patch applied, the output from the above alias (still at 80 columns) is what I'd expect:
Quote:
-Pre-wrap-
one two three four
five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen
seventeen eighteen nineteen
-Post-wrap-
one two three four
five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen
seventeen eighteen nineteen
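
For anyone who wants to experiment outside the client, the idea behind the patch can be sketched as a standalone routine. This is only an illustration under assumptions: wrap_text() is a hypothetical helper, not TinTin++'s wrapstring(), and it counts bytes rather than display columns, so it is only right for plain ASCII input. What it shows is that an existing newline resets the column counter, so the line after a hard break gets its full width.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not TinTin++'s wrapstring()): copy src into dst,
 * turning the last space before the column limit into a newline.
 * A '\n' already present in src resets the column counter.
 * dst must have room for strlen(src) + 1 bytes.
 * Words longer than cols are left unbroken.
 */
static void wrap_text(const char *src, char *dst, int cols)
{
    long len = (long) strlen(src);
    long last_space = -1;   /* index of the last breakable space on this line */
    long i;
    int col = 0;            /* characters placed on the current line */

    assert(cols > 0);

    for (i = 0; i < len; i++)
    {
        dst[i] = src[i];

        if (src[i] == '\n')
        {
            /* hard line break: start counting from zero again */
            col = 0;
            last_space = -1;
            continue;
        }

        col++;

        if (src[i] == ' ')
        {
            last_space = i;
        }

        if (col > cols && last_space >= 0)
        {
            /* break at the last space; the tail moves to the new line */
            dst[last_space] = '\n';
            col = (int) (i - last_space);
            last_space = -1;
        }
    }
    dst[len] = '\0';
}
```

Run with cols set to 80 on the $test string from the alias above, this sketch breaks the second line after "sixteen", which matches the patched output.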
Scandum
Site Admin


Joined: 03 Dec 2004
Posts: 3796

Posted: Mon Jun 30, 2014 6:58 am

Thanks, will fix that. : )
Slysven



Joined: 10 Apr 2011
Posts: 365
Location: As "Jomin al'Bara" in WoTMUD or Wiltshire, UK

Posted: Mon Jun 30, 2014 11:36 am

Um, can I ask how TinTin++ deals with UTF-8 strings? Just because a font is monospaced does not mean each character is the same width.

Only asking 'cos I'm looking at problems with this on another client that doesn't (yet) claim UTF-8 handling.

For instance, how many columns does "Café" take? Is that 'é' a U+00E9 LATIN SMALL LETTER E WITH ACUTE, or a U+0065 LATIN SMALL LETTER E followed by a U+0301 COMBINING ACUTE ACCENT?

Those Combining Diacritical Marks can really mess up text-width routines if they don't get processed correctly (Unicode Normalization followed by counting of Graphemes, NOT Characters). Another gotcha: consider "baffle". That COULD be six characters, or it could be "baﬄe", which contains only 4 "characters" (the latter contains a U+FB04 LATIN SMALL LIGATURE FFL), though that ligature does take three bytes to encode (0xEF 0xAC 0x84), so no net change in byte length in this case...

It does make some coding a lot harder to do right than one might think, as I am finding out the hard way.
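
To make the "Café" and ligature examples concrete, here is a small sketch that compares byte length with code point count. The function name utf8_codepoints() and the three constants are made up for illustration, and the counter assumes well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count Unicode code points in a UTF-8 string by skipping
 * continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8. */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);

    for (; *s != '\0'; s++)
    {
        if (((unsigned char) *s & 0xC0) != 0x80)
        {
            n++;
        }
    }
    return n;
}

/* "Café" with the precomposed U+00E9 (UTF-8 bytes C3 A9) */
static const char cafe_precomposed[] = "Caf\xC3\xA9";

/* "Café" as U+0065 followed by U+0301 COMBINING ACUTE ACCENT (bytes CC 81) */
static const char cafe_combining[] = "Cafe\xCC\x81";

/* "baffle" spelled with U+FB04 LATIN SMALL LIGATURE FFL (bytes EF AC 84);
 * the literal is split so the 'e' is not read as part of the hex escape */
static const char baffle_ligature[] = "ba\xEF\xAC\x84" "e";
```

With these definitions, strlen() reports 5, 6 and 6 bytes, utf8_codepoints() reports 4, 5 and 4 code points, yet both spellings of "Café" should occupy four columns on screen. Neither number alone is the display width.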
Scandum
Site Admin


Joined: 03 Dec 2004
Posts: 3796

Posted: Mon Jun 30, 2014 6:20 pm

I'm treating all characters as same width. Primarily concerned with supporting European languages.
Slysven



Joined: 10 Apr 2011
Posts: 365
Location: As "Jomin al'Bara" in WoTMUD or Wiltshire, UK

Posted: Thu Jul 03, 2014 3:09 pm

Scandum wrote:
I'm treating all characters as same width. Primarily concerned with supporting European languages

Well that's what was puzzling me - as an en-GB (English [United Kingdom]) speaker I don't encounter many accented letters but I know other languages in mainland Europe sprinkle all sorts of accents around like exotic herbs and spices in their varied cuisines.

So, in a MUD client, you could receive accented characters like that second "Café": to normal C string routines it will appear to be FIVE characters, but on screen it occupies the space of only FOUR "graphemes". Any routine that works out how many columns a string takes on the display has to take this into account; just counting characters isn't enough.
Scandum
Site Admin


Joined: 03 Dec 2004
Posts: 3796

Posted: Sat Jul 05, 2014 7:49 am

Thought you meant the display width on screen.

You can determine the length of a UTF-8 character with some bit checks.
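
A minimal sketch of such bit checks, assuming "length" means the number of bytes in the encoded sequence. The function name utf8_seq_len() is hypothetical and is not taken from the TinTin++ source; it inspects only the lead byte and does not validate the continuation bytes that follow.

```c
#include <assert.h>

/* Return the number of bytes in the UTF-8 sequence that starts with
 * lead byte c, or 0 if c cannot start a sequence (a continuation byte
 * 10xxxxxx, or a byte of the form 11111xxx). */
static int utf8_seq_len(unsigned char c)
{
    if ((c & 0x80) == 0x00) return 1;   /* 0xxxxxxx: ASCII          */
    if ((c & 0xE0) == 0xC0) return 2;   /* 110xxxxx: 2-byte sequence */
    if ((c & 0xF0) == 0xE0) return 3;   /* 1110xxxx: 3-byte sequence */
    if ((c & 0xF8) == 0xF0) return 4;   /* 11110xxx: 4-byte sequence */
    return 0;
}
```

For example, the lead byte 0xC3 of 'é' (U+00E9) gives 2, and the lead byte 0xEF of the U+FB04 ligature gives 3.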
Slysven



Joined: 10 Apr 2011
Posts: 365
Location: As "Jomin al'Bara" in WoTMUD or Wiltshire, UK

Posted: Tue Jul 29, 2014 8:30 pm

Scandum wrote:
Thought you meant the display width on screen.
Well, that also factors into things.
Scandum wrote:
You can determine the length of an UTF-8 character with some bit checks.
Agreed, if by length you mean "how many bytes are needed to store this character". But if the character interacts in some way with those either side of it (which is particularly relevant to diacriticals, but by no means unique to them), then calculating the space needed to display it, which is directly relevant to the OP's area of interest, takes extra work to get right.

In a GUI (development) environment there are most likely libraries or modules to help one work these things out, but for a text console application such as TinTin++ it is not clear to me what is available. I initially thought ICU, but now I'm not so sure whether that would help...

As a "quick and dirty" solution perhaps it might be sufficient to identify the diacritical characters and the few others (non-breaking space etc.) to tweak the behaviour around them - I believe at least one well known terminal emulator did (and maybe still does) do this to appear to solve the problem which might be enough for use in "The West" but means it is useless in Arabic/Indic/Asian areas.
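
The "quick and dirty" approach could look something like the sketch below. It is an assumption-laden illustration, not what any particular terminal emulator does: utf8_decode() and display_cols() are hypothetical names, only the Combining Diacritical Marks block (U+0300 to U+036F) is treated as zero width, and every other code point is counted as one column, so wide CJK characters, other combining ranges and bidirectional text are all handled wrongly. On POSIX systems wcwidth() is another starting point, though its answers depend on the locale.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one UTF-8 sequence at s into *cp and return the bytes consumed.
 * Malformed or truncated sequences are passed through one byte at a time. */
static size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    if (s[0] < 0x80)
    {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80)
    {
        *cp = ((s[0] & 0x1FUL) << 6) | (s[1] & 0x3FUL);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80)
    {
        *cp = ((s[0] & 0x0FUL) << 12) | ((s[1] & 0x3FUL) << 6) | (s[2] & 0x3FUL);
        return 3;
    }
    if ((s[0] & 0xF8) == 0xF0 && (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80
        && (s[3] & 0xC0) == 0x80)
    {
        *cp = ((s[0] & 0x07UL) << 18) | ((s[1] & 0x3FUL) << 12)
            | ((s[2] & 0x3FUL) << 6) | (s[3] & 0x3FUL);
        return 4;
    }
    *cp = s[0];
    return 1;
}

/* Rough column count: one column per code point, except code points in
 * the Combining Diacritical Marks block, which take no column. */
static size_t display_cols(const char *str)
{
    const unsigned char *s = (const unsigned char *) str;
    unsigned long cp;
    size_t cols = 0;

    assert(str != NULL);

    while (*s != '\0')
    {
        s += utf8_decode(s, &cp);

        if (cp < 0x0300 || cp > 0x036F)
        {
            cols++;
        }
    }
    return cols;
}
```

Under these assumptions both spellings of "Café" come out as four columns, which is what a wrap routine would need in place of a byte count.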

Possibly useful things, for anyone playing around with this:
  • mlterm - a terminal emulator that DOES handle right to left and bidirectional text as well as VERTICAL text (in both RTL and LTR forms, for cjk and mongol use respectively)
  • yudit - a unicode editor that can transcode (using uniconv library) between different Unicode/Non-unicode formats.
All times are GMT - 5 Hours