212 lines
5.6 KiB
Groff
212 lines
5.6 KiB
Groff
|
.\" $NetBSD: xstr.1,v 1.18 2005/09/11 23:29:44 wiz Exp $
|
||
|
.\"
|
||
|
.\" Copyright (c) 1980, 1993
|
||
|
.\" The Regents of the University of California. All rights reserved.
|
||
|
.\"
|
||
|
.\" Redistribution and use in source and binary forms, with or without
|
||
|
.\" modification, are permitted provided that the following conditions
|
||
|
.\" are met:
|
||
|
.\" 1. Redistributions of source code must retain the above copyright
|
||
|
.\" notice, this list of conditions and the following disclaimer.
|
||
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
||
|
.\" notice, this list of conditions and the following disclaimer in the
|
||
|
.\" documentation and/or other materials provided with the distribution.
|
||
|
.\" 3. Neither the name of the University nor the names of its contributors
|
||
|
.\" may be used to endorse or promote products derived from this software
|
||
|
.\" without specific prior written permission.
|
||
|
.\"
|
||
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
||
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
||
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
||
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
||
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
||
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
||
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
||
|
.\" SUCH DAMAGE.
|
||
|
.\"
|
||
|
.\" @(#)xstr.1 8.2 (Berkeley) 12/30/93
|
||
|
.\"
|
||
|
.Dd July 23, 2004
|
||
|
.Dt XSTR 1
|
||
|
.Os
|
||
|
.Sh NAME
|
||
|
.Nm xstr
|
||
|
.Nd "extract strings from C programs to implement shared strings"
|
||
|
.Sh SYNOPSIS
|
||
|
.Nm
|
||
|
.Op Fl cv
|
||
|
.Op Fl l Ar array
|
||
|
.Op Fl
|
||
|
.Op Ar
|
||
|
.Sh DESCRIPTION
|
||
|
.Nm
|
||
|
maintains a file
|
||
|
.Pa strings
|
||
|
into which strings in component parts of a large program are hashed.
|
||
|
These strings are replaced with references to this common area.
|
||
|
This serves to implement shared constant strings, most useful if they
|
||
|
are also read-only.
|
||
|
.Pp
|
||
|
Available options:
|
||
|
.Bl -tag -width XXlXarrayXX
|
||
|
.It Fl
|
||
|
.Nm
|
||
|
reads from the standard input.
|
||
|
.It Fl c
|
||
|
.Nm
|
||
|
will extract the strings from the C source
|
||
|
.Ar file
|
||
|
or the standard input
|
||
|
.Pq Fl ,
|
||
|
replacing
|
||
|
string references by expressions of the form (\*[Am]xstr[number])
|
||
|
for some number.
|
||
|
An appropriate declaration of
|
||
|
.Nm
|
||
|
is prepended to the file.
|
||
|
The resulting C text is placed in the file
|
||
|
.Pa x.c ,
|
||
|
to then be compiled.
|
||
|
The strings from this file are placed in the
|
||
|
.Pa strings
|
||
|
data base if they are not there already.
|
||
|
Repeated strings and strings which are suffixes of existing strings
|
||
|
do not cause changes to the data base.
|
||
|
.It Fl l Ar array
|
||
|
Specify the named array in program references to abstracted
|
||
|
strings.
|
||
|
The default array name is xstr.
|
||
|
.It Fl v
|
||
|
Be verbose.
|
||
|
.El
|
||
|
.Pp
|
||
|
After all components of a large program have been compiled, a file
|
||
|
.Pa xs.c
|
||
|
declaring the common
|
||
|
.Nm
|
||
|
space can be created by a command of the form:
|
||
|
.Pp
|
||
|
.Dl $ xstr
|
||
|
.Pp
|
||
|
The file
|
||
|
.Pa xs.c
|
||
|
should then be compiled and loaded with the rest
|
||
|
of the program.
|
||
|
If possible, the array can be made read-only (shared) saving
|
||
|
space and swap overhead.
|
||
|
.Pp
|
||
|
.Nm
|
||
|
can also be used on a single file.
|
||
|
The following command creates files
|
||
|
.Pa x.c
|
||
|
and
|
||
|
.Pa xs.c
|
||
|
as before, without using or affecting any
|
||
|
.Pa strings
|
||
|
file in the same directory:
|
||
|
.Pp
|
||
|
.Dl $ xstr name
|
||
|
.Pp
|
||
|
It may be useful to run
|
||
|
.Nm
|
||
|
after the C preprocessor if any macro definitions yield strings
|
||
|
or if there is conditional code which contains strings
|
||
|
which may not, in fact, be needed.
|
||
|
An appropriate command sequence for running
|
||
|
.Nm
|
||
|
after the C preprocessor is:
|
||
|
.Pp
|
||
|
.Bd -literal -offset indent
|
||
|
$ cc \-E name.c | xstr \-c \-
|
||
|
$ cc \-c x.c
|
||
|
$ mv x.o name.o
|
||
|
.Ed
|
||
|
.Pp
|
||
|
.Nm
|
||
|
does not touch the file
|
||
|
.Pa strings
|
||
|
unless new items are added, thus
|
||
|
.Xr make 1
|
||
|
can avoid remaking
|
||
|
.Pa xs.o
|
||
|
unless truly necessary.
|
||
|
.Sh FILES
|
||
|
.Bl -tag -width /tmp/xsxx* -compact
|
||
|
.It Pa strings
|
||
|
Data base of strings
|
||
|
.It Pa x.c
|
||
|
Massaged C source
|
||
|
.It Pa xs.c
|
||
|
C source for definition of array `xstr'
|
||
|
.It Pa /tmp/xs*
|
||
|
Temp file when `xstr name' doesn't touch
|
||
|
.Pa strings
|
||
|
.El
|
||
|
.Sh SEE ALSO
|
||
|
.Xr mkstr 1
|
||
|
.Sh HISTORY
|
||
|
The
|
||
|
.Nm
|
||
|
command appeared in
|
||
|
.Bx 3.0 .
|
||
|
.Sh BUGS
|
||
|
If a string is a suffix of another string in the data base,
|
||
|
but the shorter string is seen first by
|
||
|
.Nm
|
||
|
both strings will be placed in the data base, when just
|
||
|
placing the longer one there will do.
|
||
|
.Pp
|
||
|
.Nm
|
||
|
does not parse the file properly so it does not know not to process:
|
||
|
.Bd -literal
|
||
|
char var[] = "const";
|
||
|
.Ed
|
||
|
into:
|
||
|
.Bd -literal
|
||
|
char var[] = (\*[Am]xstr[N]);
|
||
|
.Ed
|
||
|
.Pp
|
||
|
These must be changed manually into an appropriate initialization for
|
||
|
the string, or use the following ugly hack.
|
||
|
.Pp
|
||
|
Also,
|
||
|
.Nm
|
||
|
cannot initialize structures and unions that contain strings.
|
||
|
Those can be fixed by changing from:
|
||
|
.Bd -literal
|
||
|
struct foo {
|
||
|
int i;
|
||
|
char buf[10];
|
||
|
} = {
|
||
|
1, "foo"
|
||
|
};
|
||
|
.Ed
|
||
|
to:
|
||
|
.Bd -literal
|
||
|
struct foo {
|
||
|
int i;
|
||
|
char buf[10];
|
||
|
} = {
|
||
|
1, { 'f', 'o', 'o', '\e0' }
|
||
|
};
|
||
|
.Ed
|
||
|
.Pp
|
||
|
The real problem in both cases above is that the compiler knows the size
|
||
|
of the literal constant so that it can perform the initialization required,
|
||
|
but when
|
||
|
.Nm
|
||
|
changes the literal string to a pointer reference, the size information is
|
||
|
lost.
|
||
|
It would require a real parser to do this right, so the obvious solution is
|
||
|
to fix the program manually to compile, or even better rely on the compiler
|
||
|
and the linker to merge strings appropriately.
|
||
|
.Pp
|
||
|
Finally,
|
||
|
.Nm
|
||
|
is not very useful these days because most of the string merging is done
|
||
|
automatically by the compiler and the linker, provided that the strings
|
||
|
are identical and read-only.
|