If I choose consistency and availability, this means that I can't have partition tolerance. What does this mean? Does this mean that the whole system has to be shut down? If so, I won't have availability either? Is this a contradiction?
I know that when there is no partition, you can have all three.
The CAP theorem reasons about which guarantees a cluster can keep when one or more of its nodes get isolated from the rest. E.g. if one node is unable to reach the rest of the cluster, it has three options: 1) respond to any request it receives, thereby guaranteeing A but not C; 2) refuse to respond, thereby ensuring C instead of A; 3) shut down before receiving any request, which terminates the partition.
Regarding the last option, you sacrifice P because you avoid the partition instead of tolerating it, i.e. the cluster minus the node that shuts down becomes CA, at least until the next partition occurs, at which point more node(s) will need to shut down. Since you can't really prevent partitions from happening, this model converges to the scenario in which the cluster has only one node. In that case there is no "rest of the cluster" to get isolated from, hence it is always CA.
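To make the three options concrete, here is a toy sketch in Python. It is not how any real database implements this; the `Policy`, `Node` and `demo` names are made up for illustration, and the "write" that happens on the other side of the partition is only described in a comment.

```python
from enum import Enum


class Policy(Enum):
    AP = "answer anyway"      # option 1: keep A, give up C
    CP = "refuse to answer"   # option 2: keep C, give up A
    CA = "shut down"          # option 3: leave the cluster, avoiding the partition


class Node:
    """Hypothetical node holding one value; for illustration only."""

    def __init__(self, policy, value=0):
        self.policy = policy
        self.value = value      # last value this node knows about
        self.isolated = False   # True once cut off from the rest of the cluster
        self.running = True

    def isolate(self):
        """Simulate losing contact with the rest of the cluster."""
        self.isolated = True
        if self.policy is Policy.CA:
            self.running = False  # stop before any request arrives

    def read(self):
        if not self.running:
            raise ConnectionError("node is down")
        if self.isolated and self.policy is Policy.CP:
            raise TimeoutError("cannot confirm the latest value")
        return self.value  # possibly stale when isolated under AP


def demo(policy):
    node = Node(policy, value=1)
    node.isolate()
    # Assume the rest of the cluster now accepts a write changing the value to 2.
    # The isolated node never hears about it.
    try:
        return node.read()
    except (ConnectionError, TimeoutError) as exc:
        return type(exc).__name__
```

Under these assumptions, `demo(Policy.AP)` returns the stale `1` (available, not consistent), `demo(Policy.CP)` reports a `TimeoutError` (consistent, not available), and `demo(Policy.CA)` reports a `ConnectionError` because the node has removed itself from the cluster.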
In practice, if you want a cluster you do not want a one-node cluster, as having several nodes is kind of the whole point. Hence you are unlikely to find many CA alternatives, and you end up choosing between C and A.
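The convergence toward a single node under the shut-down strategy can also be shown with a toy calculation. The function name and the assumption that each partition isolates a fixed number of nodes are mine, purely for illustration.

```python
def shrink_on_partitions(cluster_size, isolated_per_partition=1):
    """Return the cluster size after each partition when isolated nodes shut down."""
    if isolated_per_partition < 1:
        raise ValueError("each partition must isolate at least one node")
    sizes = [cluster_size]
    while cluster_size > 1:
        # Every isolated node shuts down; at least one node always survives.
        cluster_size = max(1, cluster_size - isolated_per_partition)
        sizes.append(cluster_size)
    return sizes
```

For example, `shrink_on_partitions(5)` gives `[5, 4, 3, 2, 1]`: each partition costs a node until only the trivially CA one-node "cluster" is left.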
See this answer for better examples.