Re: funny characters
>hi all stateside
>
>anybody got an idea why:-
>
>i open under zindoze notepad
>
>
>eh voila there is three little charcters black filled squares
>i can cut and copy etc but if i delete them save and reopen the file
>they're back
>so i look at it with xedit under linux
>it shows the follwing ----- ^M
>delte it it, then save and reopen voila gone!!!
> what is it ?????
>sorry if this question is a bit thick
>but 7 yrs of windows has made a blind man of me when it comes
>to this sort of thing thanx
>oh yeh "we won the cup"
DOS-based text files are defined (by convention) to have records, or rows,
separated by the 2-character ASCII sequence CR-LF (Carriage Return,
Linefeed). UNIX and UNIX-like systems (of which Linux is one) separate
records with only the Linefeed character.
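To make the two conventions concrete, here is a minimal Python sketch that counts each kind of record separator in a chunk of bytes. The function name count_line_endings and the sample strings are made up for illustration; they are not from any standard tool.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR-LF pairs and bare LF characters in raw file contents."""
    crlf = data.count(b"\r\n")
    # Every CR-LF pair also contains an LF, so subtract the pairs
    # to get the LFs that stand alone (the UNIX convention).
    lf_only = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf_only": lf_only}


dos_text = b"first row\r\nsecond row\r\n"   # DOS: CR (0x0D) then LF (0x0A)
unix_text = b"first row\nsecond row\n"      # UNIX: LF (0x0A) only

print(count_line_endings(dos_text))   # {'crlf': 2, 'lf_only': 0}
print(count_line_endings(unix_text))  # {'crlf': 0, 'lf_only': 2}
```

To try it on a real file, read the file in binary mode (open(path, "rb")) so Python does not translate the line endings before you count them.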
If you move a DOS file to a UNIX system (without doing some sort of
conversion on it), the "extra" character (the CR) is often represented
in UNIX editors by the characters ^M, or Control-M, another way of
producing a CR. Go to a Linux console or an xterm shell, hold down the
control (or Ctrl) key, and press m, and you'll see it produces a
carriage return.
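The link between ^M and CR is plain ASCII arithmetic: holding Ctrl keeps only the low five bits of the letter's code, and editors display a control character as a caret plus the letter 64 positions higher. A small Python sketch of that mapping (the helper names control_char and caret_notation are invented for this example):

```python
def control_char(letter: str) -> int:
    """ASCII code produced by holding Ctrl and pressing the given letter."""
    return ord(letter.upper()) & 0x1F


def caret_notation(byte: int) -> str:
    """Caret form that many UNIX editors use to show a control character."""
    return "^" + chr(byte ^ 0x40)


print(control_char("m"))      # 13, the code for CR
print(caret_notation(0x0D))   # ^M
print(caret_notation(0x0A))   # ^J, which is how a bare LF would be shown
```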
On moving a file from a UNIX to a DOS-based system, there are no
records! That is, since the CR-LF sequence appears nowhere, DOS
applications have no way to figure out where records start. Some
graphical editors on these systems replace each non-printable
character (LF is one) with a black bullet character (I don't know
the value offhand) to show that SOMETHING is there but can't be
displayed.
The solution is to run conversion programs in each direction: going
DOS->UNIX, you need a program that removes the CR, and going
UNIX->DOS, you need one that puts it back. Programs that do this
explicitly exist (I've written a pair, but I do weird things with
files!), but most of the time you don't need them. When you use an ftp
client program to move data, it will perform this removal or insertion
when you're in ASCII mode, and you can always run the ftp client/server
components on your own machine, even when not connected to a network
(ftp localhost).
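For the curious, a conversion pair along these lines can be sketched in a few lines of Python. This is only an illustration, not the pair mentioned above; the names dos_to_unix and unix_to_dos are made up, and the functions work on raw bytes, so files should be read and written in binary mode.

```python
def dos_to_unix(data: bytes) -> bytes:
    """DOS->UNIX: drop the CR from every CR-LF pair."""
    return data.replace(b"\r\n", b"\n")


def unix_to_dos(data: bytes) -> bytes:
    """UNIX->DOS: put a CR in front of every LF."""
    # Strip existing CR-LF pairs first, so a file that is already in
    # DOS format does not end up with CR-CR-LF.
    return dos_to_unix(data).replace(b"\n", b"\r\n")


unix_text = b"first row\nsecond row\n"
dos_text = unix_to_dos(unix_text)
print(dos_text)               # b'first row\r\nsecond row\r\n'
print(dos_to_unix(dos_text))  # b'first row\nsecond row\n'
```

Only CRs that sit directly before an LF are touched here; a stray CR elsewhere in the file is left alone, which seemed the safer choice for a sketch.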
A final point: there are some packages in the DOS/Windows environment
(one that comes to mind is WordPerfect) that do not follow the DOS
conventions in a strict sense, as they recognize LF-only files as having
distinct records. It's possible to load a "DOS native" file that came
from a UNIX system into WordPerfect, then save it as a "DOS native" file,
and have the CRs inserted. I don't know if this behavior holds for the Unix
releases of WP, but I suspect it does (others, like Adam, can comment on
this). I also expect that WP is not unique in this regard.
---> RGB <---