C++ Primer Ch. 3 - Strings, Vectors, and Arrays: Reading Notes
3.1 Namespace using Declarations
A Separate using Declaration Is Required for Each Name. The important part is that there must be a using declaration for each name we use, and each declaration must end in a semicolon.
The 5th edition of C++ Primer only covers C++11, where every name needs its own using declaration. Since C++17, a using-declaration also accepts a comma-separated list of names.
https://en.cppreference.com/w/cpp/language/using_declaration
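A minimal sketch of the two styles side by side, assuming a C++17 compiler. The helper join_numbers and the check function are made-up names for illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// C++11 style: one using declaration per name.
using std::string;
// C++17 style: a comma-separated list in a single using declaration.
using std::vector, std::to_string;

// Hypothetical helper that uses all three names without the std:: prefix.
string join_numbers(const vector<int> &nums) {
    string out;
    for (int n : nums)
        out += to_string(n) + " ";
    return out;
}

// Hypothetical check, meant to be called from main.
void check_using_declarations() {
    assert(join_numbers({1, 2, 3}) == "1 2 3 ");
}
```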
Headers Should Not Include using Declarations
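During preprocessing, the contents of a header are copied into the including file at the #include line. A using declaration in a header therefore behaves as if it were written in every .cpp file that includes the header, which can cause unexpected name conflicts.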
3.2 Library string Type
3.2.1 Defining and Initializing strings
When a string is initialized from a string literal, the null character at the end of the literal is not copied into the string.
copy initialization
When we initialize a variable using =, we are asking the compiler to copy initialize the object.
string s5 = "hiya"; // copy initialization
direct initialization
When we omit the =, we use direct initialization.
string s6("hiya"); // direct initialization
copy initialization with multi-value initializer
string s8 = string(10, 'c'); // copy initialization; s8 is cccccccccc
The copy initialization above actually creates a temporary object, so it is equivalent to writing:
string temp(10, 'c'); // temp is cccccccccc
string s8 = temp; // copy temp into s8
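3.2.2 Operations on strings
Reading and Writing strings
The string input operator (>>) reads and discards any leading whitespace (spaces, newlines, tabs), then reads characters until the next whitespace character is encountered.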
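A small sketch of this behavior. It uses std::istringstream in place of cin so that the input is fixed; first_word and check_first_word are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reads one word from text with >>, the same operator used on cin.
std::string first_word(const std::string &text) {
    std::istringstream in(text);  // stands in for cin so the input is fixed
    std::string word;
    in >> word;                   // skips leading whitespace, stops at the next one
    return word;
}

void check_first_word() {
    assert(first_word("  \n\t Hello World!  ") == "Hello");
    assert(first_word("   ").empty());  // nothing but whitespace: word stays empty
}
```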
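The string::size_type Type
The size member of string returns a string::size_type value. The exact underlying type is not specified, but it is guaranteed to be an unsigned type.
When looping over a string with a for loop, the loop variable is also best declared as string::size_type, or with decltype applied to size().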
// EN p94
// process characters in s until we run out of characters or we hit a whitespace
for (decltype(s.size()) index = 0;
     index != s.size() && !isspace(s[index]); ++index)
    s[index] = toupper(s[index]); // capitalize the current character
(That said, I never paid attention to size_type before and always just looped with int i.)
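3.2.3 Dealing with the Characters in a string
ADVICE: USE THE C++ VERSIONS OF C LIBRARY HEADERS
The C++ library incorporates the C library. C headers are named name.h; the C++ versions drop the .h suffix and add a c prefix, giving cname. For example, ctype.h becomes cctype.
Names declared in the cname headers are defined inside the std namespace, whereas those in the .h versions are not. C++ programs should therefore use the C++ versions of the headers to avoid name conflicts.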
Processing Only Some Characters?
subscript operator (the [] operator): The result of using an out-of-range subscript is undefined.
Processing Every Character? Use Range-Based for
Range-based for (range for statement):
// EN p91
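A minimal sketch of the two common uses of range for on a string, written for these notes rather than copied from the book. The function names are made up; the casts to unsigned char are a defensive assumption for characters outside the basic set.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Counts punctuation characters; c is a copy of each character in turn.
std::string::size_type count_punct(const std::string &s) {
    std::string::size_type n = 0;
    for (auto c : s)
        if (std::ispunct(static_cast<unsigned char>(c)))
            ++n;
    return n;
}

// Upper-cases s in place; c must be a reference to change the characters.
void to_upper(std::string &s) {
    for (auto &c : s)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

void check_range_for() {
    std::string s("Hello World!!!");
    assert(count_punct(s) == 3);
    to_upper(s);
    assert(s == "HELLO WORLD!!!");
}
```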
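3.3 Library vector Type
vector is a class template. A template is not itself a type, but supplying the element type generates a new type, such as vector<int>.
The process by which the compiler creates classes or functions from templates is called instantiation.
A vector can only hold objects. Because references are not objects, there is no vector of references.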
3.3.1 Defining and Initializing vectors
Copying a vector: each element of the original vector is copied into the new vector.
// EN p97
Value Initialization
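If only a size is given when creating a vector, the elements are value-initialized:
- If the element type is a built-in type, each element is initialized to 0.
- If the element type is a class type, each element is default initialized.
// EN p98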
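A short sketch of both cases, using int as the built-in type and std::string as the class type. The function name is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

void check_value_initialization() {
    std::vector<int> ivec(10);          // ten ints, each value-initialized to 0
    std::vector<std::string> svec(10);  // ten strings, each default initialized (empty)

    assert(ivec.size() == 10);
    for (int i : ivec)
        assert(i == 0);
    for (const std::string &s : svec)
        assert(s.empty());
}
```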
List Initializer or Element Count?
When we use curly braces, {...}, we’re saying that, if possible, we want to list initialize the object. That is, if there is a way to use the values inside the curly braces as a list of element initializers, the class will do so. Only if it is not possible to list initialize the object will the other ways to initialize the object be considered.
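In short: with curly braces {}, the compiler prefers list initialization and falls back to the other forms only when the braces cannot be used as an element list.
// EN p100
In the example above, v8 would normally be written as v8(10, "hi"). The curly braces say that we want list initialization, but v8 holds strings and 10 cannot be used to initialize a string, so the braces cannot be an element list. The compiler therefore tries the other forms of initialization and ends up constructing v8 directly with ten elements with value "hi".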
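The difference between braces and parentheses can be sketched as follows. This is a reconstruction written for these notes (the variable names follow the v8 mentioned above, the function name is made up), not the book's exact listing.

```cpp
#include <cassert>
#include <string>
#include <vector>

void check_braces_vs_parentheses() {
    std::vector<int> v1(10);     // ten elements with value 0
    std::vector<int> v2{10};     // one element with value 10
    std::vector<int> v3(10, 1);  // ten elements with value 1
    std::vector<int> v4{10, 1};  // two elements: 10 and 1

    std::vector<std::string> v5{"hi"};      // list initialization: one element
    std::vector<std::string> v7{10};        // 10 is not a string: ten empty strings
    std::vector<std::string> v8{10, "hi"};  // ten elements with value "hi"

    assert(v1.size() == 10 && v2.size() == 1);
    assert(v3.size() == 10 && v4.size() == 2);
    assert(v5.size() == 1 && v7.size() == 10);
    assert(v8.size() == 10 && v8[9] == "hi");
}
```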
3.3.2 Adding Elements to a vector
KEY CONCEPT: VECTORS GROW EFFICIENTLY
The standard requires that vector implementations can efficiently add elements at run time. Because vectors grow efficiently, it is often unnecessary—and can result in poorer performance—to define a vector of a specific size.
Takeaway: because the standard requires vector to add elements efficiently at run time, there is usually no need to give a vector a size when creating it.
indirection operator *
https://en.cppreference.com/w/cpp/language/operator_member_access
The operand of the built-in indirection operator must be a pointer to object or a pointer to function, and the result is the lvalue referring to the object or function to which expr points.
Programming Implications of Adding Elements to a vector
The body of a range for must not change the size of the vector it is iterating over.
3.3.3 Other vector Operations
Two vectors can be compared only if they have the same element type (and the element type itself supports the comparison).
- v1 == v2, v1 != v2: v1 and v2 are equal if they have the same number of elements and each element in v1 is equal to the corresponding element in v2.
- <, <=, >, >=: Have their normal meanings using dictionary ordering.
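A quick sketch of what dictionary ordering means for vectors, with made-up values and a hypothetical function name.

```cpp
#include <cassert>
#include <vector>

void check_vector_comparison() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{1, 2, 3};
    std::vector<int> c{1, 2, 4};
    std::vector<int> d{1, 2};

    assert(a == b);  // same size, equal elements
    assert(a != c);
    assert(a < c);   // first differing elements decide: 3 < 4
    assert(d < a);   // d is a prefix of a, so the shorter vector is smaller
    assert(c > d);
}
```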
3.4 Introducing Iterators
Technically speaking, a string is not a container type, but string supports many of the container operations.
Like pointers, iterators give us indirect access to an object.
A valid iterator either:
- denotes an element, or
- denotes a position one past the last element in a container.
3.4.1 Using Iterators
Types that have iterators have members that return iterators. Two of them are begin and end.
begin returns an iterator that denotes the first element. end returns an iterator positioned one past the end; it is often called the off-the-end iterator, or end iterator for short.
If the container is empty, begin and end return the same iterator, and both are off-the-end iterators.
In general, we do not know (or care about) the precise type that an iterator has.
Iterator Operations
Dereferencing an invalid iterator or an off-the-end iterator is undefined behavior.
- *iter: Returns a reference to the element denoted by the iterator iter. Note that dereferencing an iterator yields a reference!
- iter1 == iter2, iter1 != iter2: Compares two iterators for equality (inequality). Two iterators are equal if they denote the same element or if they are the off-the-end iterator for the same container.
Moving Iterators from One Element to Another
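- ++iter: Increments iter to refer to the next element in the container.
- --iter: Decrements iter to refer to the previous element in the container.
The iterator returned by end does not denote an element, so it must not be incremented or dereferenced. (It can be decremented, provided the container is not empty.)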
Iterator Types
The library container types define the types of their iterators: iterator and const_iterator.
// EN p108
The begin and end Operations
Even for a non-const container, cbegin and cend return a const_iterator:
// EN p109
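A sketch that checks which iterator type each call returns, using static_assert with std::is_same. It is written for these notes, and the function name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

void check_iterator_types() {
    std::vector<int> v{1, 2, 3};
    const std::vector<int> cv{1, 2, 3};

    auto it1 = v.begin();   // non-const vector: iterator
    auto it2 = cv.begin();  // const vector: const_iterator
    auto it3 = v.cbegin();  // cbegin always gives const_iterator

    static_assert(std::is_same<decltype(it1), std::vector<int>::iterator>::value, "");
    static_assert(std::is_same<decltype(it2), std::vector<int>::const_iterator>::value, "");
    static_assert(std::is_same<decltype(it3), std::vector<int>::const_iterator>::value, "");

    *it1 = 42;                        // writing through an iterator is fine
    assert(*it3 == 42 && *it2 == 1);  // const_iterators can read but not write
}
```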
Combining Dereference and Member Access
it->mem is a synonym for (*it).mem
Some vector Operations Invalidate Iterators
Any operation that changes the size of a vector, such as push_back, potentially invalidates all iterators into that vector.
3.4.2 Iterator Arithmetic
Iterators for string and vector support some additional operations:
- iter + n, iter - n: Adding (subtracting) an integral value n to (from) an iterator yields an iterator that many elements forward (backward) within the container. The resulting iterator must denote elements in, or one past the end of, the same container.
- iter1 += n, iter1 -= n: Compound-assignment for iterator addition and subtraction. Assigns to iter1 the value of adding n to, or subtracting n from, iter1.
- iter1 - iter2: Subtracting two iterators yields the number that when added to the right-hand iterator yields the left-hand iterator. The iterators must denote elements in, or one past the end of, the same container.
- >, >=, <, <=: Relational operators on iterators. One iterator is less than another if it refers to an element that appears in the container before the one referred to by the other iterator. The iterators must denote elements in, or one past the end of, the same container.
To subtract two iterators, or to compare them with the relational operators, both must refer to the same container.
The result of subtracting two iterators has type difference_type, which is a signed integral type.
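Binary search is the classic use of iterator arithmetic. The sketch below is written for these notes; sorted_contains is a made-up name, and the input is assumed to be sorted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if sought is in text, which must be sorted.
bool sorted_contains(const std::vector<std::string> &text, const std::string &sought) {
    auto beg = text.cbegin(), end = text.cend();
    while (beg != end) {
        auto mid = beg + (end - beg) / 2;  // iterator + difference_type
        if (*mid == sought)
            return true;
        if (sought < *mid)
            end = mid;        // keep searching the first half
        else
            beg = mid + 1;    // keep searching the second half
    }
    return false;
}

void check_sorted_contains() {
    std::vector<std::string> words{"apple", "banana", "cherry", "kiwi"};
    assert(sorted_contains(words, "cherry"));
    assert(!sorted_contains(words, "durian"));
    assert(!sorted_contains({}, "apple"));
}
```

The midpoint is computed as beg + (end - beg) / 2 because two iterators can be subtracted but not added.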
3.5 Arrays
3.5.1 Defining and Initializing Built-in Arrays
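An array is a compound type.
The dimension specifies the number of elements. The number of elements is part of the array's type, so the dimension must be known at compile time; that is, it must be a constant expression.
An array defined inside a function without an initializer is default-initialized. For built-in element types this means the elements have undefined values.
auto cannot be used to deduce the type of an array from a list of initializers.
An array can only hold objects, so there are no arrays of references.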
Explicitly Initializing Array Elements
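// EN p114
The dimension may be omitted when declaring an array, but then an initializer list must be supplied so that the compiler can infer the size from the number of initializers.
If the dimension is greater than the number of initializers, the remaining elements are value initialized.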
Character Arrays Are Special
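When a char array is initialized from a string literal, the null character (\0) that ends the literal is copied into the array along with the other characters, so the array needs room for it.
// EN p114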
No Copy or Assignment
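An array cannot be initialized as a copy of another array, nor can one array be assigned to another. Some compilers allow array assignment as an extension, but it is not a standard feature.
int a[] = {0, 1, 2}; // array of three ints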
Understanding Complicated Array Declarations
See the Clockwise/Spiral Rule (https://c-faq.com/decl/spiral.anderson.html):
There are three simple steps to follow:
Starting with the unknown element, move in a spiral/clockwise direction; when encountering the following elements replace them with the corresponding English statements:
- [X] or [] => Array X size of... or Array undefined size of...
- (type1, type2) => function passing type1 and type2 returning...
- * => pointer(s) to...
Keep doing this in a spiral/clockwise direction until all tokens have been covered.
Always resolve anything in parenthesis first!
Example #1: Simple declaration
char *str[10];
Question we ask ourselves: What is str?
`str` is an...
We move in a spiral clockwise direction starting with `str` and the first character we see is a `[` so, that means we have an array, so...
`str` is an array 10 of...
Continue in a spiral clockwise direction, and the next thing we encounter is the `*` so, that means we have pointers, so...
`str` is an array 10 of pointers to...
Continue in a spiral direction and we see the end of the line (the `;`), so keep going and we get to the type `char`, so...
`str` is an array 10 of pointers to `char`
We have now "visited" every token; therefore we are done!
Example #2: Pointer to Function declaration
char *(*fp)( int, float *);
Question we ask ourselves: What is fp?
`fp` is a...
Moving in a spiral clockwise direction, the first thing we see is a `)`; therefore, fp is inside parenthesis, so we continue the spiral inside the parenthesis and the next character seen is the `*`, so...
`fp` is a pointer to...
We are now out of the parenthesis and continuing in a spiral clockwise direction, we see the `(`; therefore, we have a function, so...
`fp` is a pointer to a function passing an int and a pointer to float returning...
Continuing in a spiral fashion, we then see the `*` character, so...
`fp` is a pointer to a function passing an int and a pointer to float returning a pointer to...
Continuing in a spiral fashion we see the `;`, but we haven't visited all tokens, so we continue and finally get to the type `char`, so...
`fp` is a pointer to a function passing an int and a pointer to float returning a pointer to a `char`
What happens if we remove the parentheses around *fp?
char **fp( int, float *);
Now fp is a function, and that function returns a pointer to a pointer to char:
fp is a function (int, float *) returning a pointer to a pointer to char
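The readings above can be checked by the compiler with type traits. In this sketch the version without parentheses is renamed fn (a made-up name) so that both declarations can coexist in one file; fn is only declared, never called.

```cpp
#include <type_traits>

char *str[10];              // array 10 of pointers to char
char *(*fp)(int, float *);  // pointer to function (int, float *) returning pointer to char
char **fn(int, float *);    // function (int, float *) returning pointer to pointer to char

static_assert(std::is_array<decltype(str)>::value, "str is an array");
static_assert(std::extent<decltype(str)>::value == 10, "of 10 elements");
static_assert(std::is_same<std::remove_extent<decltype(str)>::type, char *>::value,
              "whose elements are pointers to char");

static_assert(std::is_pointer<decltype(fp)>::value, "fp is a pointer");
static_assert(std::is_same<decltype(fp), char *(*)(int, float *)>::value,
              "to a function returning char *");

static_assert(std::is_function<decltype(fn)>::value, "fn is a function");
static_assert(std::is_same<decltype(fn), char **(int, float *)>::value,
              "returning char **");
```

If any reading were wrong, the file would fail to compile with the message attached to the failing static_assert.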
3.5.2 Accessing the Elements of an Array
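Array elements are accessed with the subscript operator. A variable used as the subscript should normally be declared as size_t, a machine-specific unsigned type.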
3.5.3 Pointers and Arrays
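In most places where we use an array, the compiler automatically converts it to a pointer.
When we use an object of array type, we are really using a pointer to the first element in that array.
// EN p117
When an array is used as the initializer for a variable defined with auto, the deduced type is a pointer, not an array.
// EN p117
With decltype, however, the conversion does not happen and the deduced type is still an array.
// EN p118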
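A sketch of the difference between auto and decltype, with the deduced types verified by static_assert. The variable names follow the book's ia convention; the function name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

void check_array_decay() {
    int ia[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

    auto ia2(ia);  // auto: the array converts to a pointer
    static_assert(std::is_same<decltype(ia2), int *>::value, "ia2 is an int*");

    decltype(ia) ia3 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};  // decltype: still int[10]
    static_assert(std::is_same<decltype(ia3), int[10]>::value, "ia3 is an array of 10 ints");

    assert(ia2 == &ia[0]);  // ia2 points at the first element of ia
    ia3[4] = 42;            // ia3 is a separate array; ia is unchanged
    assert(ia[4] == 4 && ia3[4] == 42);
}
```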
Pointers Are Iterators
A pointer to an array element supports the same operations as iterators on vector and string (increment, decrement, ...).
The Library begin and end Functions
// EN p118
As with iterators, an off-the-end pointer must not be dereferenced or incremented.
Pointer Arithmetic
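Pointers can use all the iterator operations described in 3.4.1 and 3.4.2.
Subtracting two pointers yields a value of type ptrdiff_t, a machine-specific signed integral type. Both pointers must point into the same array.
Pointer arithmetic is also valid for null pointers and for pointers to a non-array object, with restrictions: for a non-array object the pointers must point to that object or one past it, and for a null pointer only an integral constant expression with value 0 may be added or subtracted. This does not look very useful for now.
Subscripts and Pointers
Arrays and pointers are closely related. A pointer can be subscripted as long as it points to an element of an array, and the subscript may even be negative.
Arrays use the built-in subscript operator, whereas vector and string use operators defined by their classes. The index for the built-in operator is not an unsigned type and can be negative; the library subscript operators take unsigned values.
int ia[] = {0,2,4,6,8}; // array with 5 elements of type int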
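A sketch of pointer subscripts on the ia array above, including a negative subscript and a pointer difference. The function name is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

void check_pointer_subscripts() {
    int ia[] = {0, 2, 4, 6, 8};  // array with 5 elements of type int
    int *p = &ia[2];             // p points to the element indexed by 2

    assert(p[1] == 6);           // p[1] is *(p + 1), the same element as ia[3]
    assert(p[-2] == 0);          // p[-2] is *(p - 2), the same element as ia[0]
    assert(*(ia + 4) == 8);      // ia converts to a pointer to its first element

    std::ptrdiff_t n = std::end(ia) - std::begin(ia);  // pointer difference
    assert(n == 5);
}
```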
3.5.4 C-Style Character Strings
Although C++ supports C-style strings, they should not be used by C++ programs. C-style strings are a surprisingly rich source of bugs and are the root cause of many security problems. They’re also harder to use!
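Takeaway: avoid C-style strings in C++ programs.
C-style strings are not a type but a convention for how to represent and use character strings. Strings that follow this convention are stored in character arrays and are null terminated, that is, the last character is followed by a null character (\0).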
C Library String Functions
- strlen(p): Returns the length of p, not counting the null.
- strcmp(p1, p2): Compares p1 and p2 for equality. Returns 0 if p1 == p2, a positive value if p1 > p2, a negative value if p1 < p2.
- strcat(p1, p2): Appends p2 to p1. Returns p1.
- strcpy(p1, p2): Copies p2 into p1. Returns p1.
If a char array is not null terminated, calling strlen on it is undefined.
Comparing Strings
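Use strcmp to compare C-style strings.
Applying the ordinary relational or equality operators (>, <, ==, != ...) to two char arrays compares the pointers that the arrays convert to, not the characters. For pointers to unrelated objects, the book calls the result of a relational comparison undefined.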
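A sketch contrasting strcmp on C-style strings with the ordinary operators on library strings. The string contents and the function name are made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

void check_string_comparison() {
    const char ca1[] = "A string example";
    const char ca2[] = "A different string";

    // ca1 < ca2 would compare two unrelated addresses, not the characters.
    assert(std::strcmp(ca1, ca2) > 0);  // 's' > 'd' at the first difference
    assert(std::strcmp(ca1, "A string example") == 0);

    // Library strings compare by content with the ordinary operators.
    std::string s1 = ca1, s2 = ca2;
    assert(s1 > s2);
    assert(s1 == "A string example");
}
```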
Exercise 3.37: What does the following program do?
const char ca[] = {'h', 'e', 'l', 'l', 'o'};
const char *cp = ca;
while (*cp) {
    cout << *cp << endl;
    ++cp;
}
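'\0' is the null terminator. It is actually an escape sequence: a \ may be followed by one, two, or three octal digits, which give the numeric value of the character. (\x followed by hexadecimal digits works the same way.)
Some examples (assuming the Latin-1 character set):
\7 (bell) \12 (newline) \40 (blank)
\0 (null) \115 ('M') \x4d ('M')
\0 is the null character, whose numeric value is exactly 0. So when a string ends with \0 and we walk it with a pointer, reaching \0 yields 0, which converts to false, and the loop stops. In the exercise above, ca has no null terminator, so the loop runs past the end of the array and the behavior is undefined.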
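A sketch of the same pointer loop on a properly terminated array, plus compile-time checks of the escape sequences. The two 'M' checks assume an ASCII-compatible character set; c_string_length and check_escapes are made-up names.

```cpp
#include <cassert>
#include <cstddef>

// Counts characters up to, but not including, the null terminator.
std::size_t c_string_length(const char *cp) {
    std::size_t n = 0;
    while (*cp) {  // '\0' has value 0, which converts to false
        ++n;
        ++cp;
    }
    return n;
}

void check_escapes() {
    static_assert('\0' == 0, "the null character has value 0");
    // The two checks below assume an ASCII-compatible character set.
    static_assert('\115' == 'M', "octal 115 is 77");
    static_assert('\x4d' == 'M', "hexadecimal 4d is 77");

    const char ca[] = "hello";  // the literal adds the terminating '\0'
    static_assert(sizeof(ca) == 6, "five letters plus the null");
    assert(c_string_length(ca) == 5);
}
```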
3.5.5 Interfacing to Older Code
Mixing Library strings and C-Style Strings
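A string can be initialized from a string literal. Going the other way, the c_str member returns a C-style string. However, the array returned by c_str is not guaranteed to remain valid: any later operation that changes the string can invalidate it.
string s("Hello World"); // s holds Hello World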
Using an Array to Initialize a vector
// EN p125
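A sketch of initializing a vector from an array by passing a pair of pointers, written for these notes with a hypothetical function name.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

void check_array_to_vector() {
    int int_arr[] = {0, 1, 2, 3, 4, 5};

    // Copy all six elements: begin/end give pointers to the first and one past the last.
    std::vector<int> ivec(std::begin(int_arr), std::end(int_arr));
    // Copy three elements: int_arr[1], int_arr[2], int_arr[3].
    std::vector<int> subvec(int_arr + 1, int_arr + 4);

    assert(ivec.size() == 6 && ivec.back() == 5);
    assert((subvec == std::vector<int>{1, 2, 3}));
}
```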
ADVICE: USE LIBRARY TYPES INSTEAD OF ARRAYS
Modern C++ programs should use vectors and iterators instead of built-in arrays and pointers, and use strings rather than C-style array-based character strings.
3.6 Multidimensional Arrays
There is no limit on the number of subscripts (dimensions).
int arr[10][20][30] = {0}; // initialize all elements to 0
In a two-dimensional array, the first dimension is usually referred to as the row and the second as the column.
Initializing the Elements of a Multidimensional Array
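A multidimensional array can be initialized with nested braces, one braced list per row, or with a single flat list.
// EN p 126
To initialize only some elements of each row, however, the nested braces are required; with a flat list the initializers would simply fill the first few elements in order.
// explicitly initialize only element 0 in each row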
Subscripting a Multidimensional Array
If we supply fewer subscripts than there are dimensions, the result is the inner array at the given index.
// EN p127
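Using a Range for with Multidimensional Arrays
When traversing a multidimensional array with a range-based for, the loop control variable of every loop except the innermost one must be a reference.
for (auto &row : ia) // for every element in the outer array
What happens if the outer loop does not use a reference? A range for is syntactic sugar that the compiler rewrites into an ordinary for loop over begin(ia) and end(ia), with auto row = *beg; at the top of the loop body.
for (auto row : ia)
begin(ia) returns a pointer to the first element of ia, so its type is a pointer to an array of 4 ints: int (*p)[4].
Dereferencing it, *beg, yields a reference bound to an array of 4 ints.
auto ignores references, and using an array is like using a pointer to its first element, so auto row = *beg; gives row the type pointer to int. The inner range for then fails to compile, because it cannot iterate over an int*.
For more on range-based for, see my other note on the topic.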
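The deduction described above can be sketched and checked with static_assert. This assumes the book's 3 by 4 array ia; rowCnt and colCnt follow the book's naming, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

void check_multidim_range_for() {
    constexpr std::size_t rowCnt = 3, colCnt = 4;
    int ia[rowCnt][colCnt];

    std::size_t cnt = 0;
    for (auto &row : ia)       // row is int (&)[4]: a reference to an inner array
        for (auto &col : row)  // col is int&, so the assignment changes ia
            col = static_cast<int>(cnt++);

    assert(ia[0][0] == 0 && ia[1][0] == 4 && ia[2][3] == 11);

    // What the non-reference version would deduce for its loop variable:
    auto beg = std::begin(ia);  // pointer to the first inner array
    static_assert(std::is_same<decltype(beg), int (*)[4]>::value, "");
    auto row = *beg;            // the array converts to a pointer; row is int*
    static_assert(std::is_same<decltype(row), int *>::value, "");
    assert(row == &ia[0][0]);
    // for (auto col : row) would not compile: there is no begin/end for an int*.
}
```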
Pointers and Multidimensional Arrays
Because a multidimensional array is really an array of arrays, the pointer type to which the array converts is a pointer to the first inner array:
// EN p129
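Thoughts
Arrays take up a large part of this chapter, yet I have almost never seen built-in arrays used at work; people reach for vector at the very least. Perhaps arrays are more common in embedded development? Arrays are also treated as pointers in many contexts, and multidimensional arrays are more complicated still, which makes them error-prone and hard to read. The book itself recommends vector, iterators, and string for modern C++ programs.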