Postgres COPY FROM csv file - No such file or directory

  • I'm trying to import a (rather large) .txt file into a table geonames in PostgreSQL 9.1. I'm in the /~ directory of my server, with a file named US.txt placed in that directory. I set the search_path variable to geochat, the name of the database I'm working in. I then enter this query:

    COPY geonames
    FROM 'US.txt',
    (DELIMITER E'\t',
    NULL 'NULL');

    I then receive this error:

    ERROR: could not open file "US.txt" for reading: No such file or directory.

    Do I have to type in \i US.txt or something similar first, or should it just get it from the present working directory?

  • A couple of misconceptions:


    I'm in the /~ directory of my server

    There is no directory /~. It's either / (root directory) or ~ (home directory of current user). It's also irrelevant to the problem.


    I set the search_path variable to geochat, the name of the database I'm working in

    The search_path has nothing to do with the name of the database. It's for schemas inside the current database. You probably need to reset this.

    You are required to use the absolute path for your file, as documented in the manual:


    The absolute path name of the input or output file.

    DELIMITER: just noise.

    The default is a tab character in text format

    NULL: It's rather uncommon to use the actual string 'NULL' for a NULL value. Are you sure?

    The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format.

    My guess (after resetting search_path, or after schema-qualifying the table name):

    COPY geonames FROM '/path/to/file/US.txt';

  • Maybe a bit late, but hopefully useful:

    Use \copy instead. Unlike COPY, the psql meta-command \copy reads the file on the client side (where psql runs), so a path relative to your current directory works.


  • The paths are relative to the PostgreSQL server, not the psql client.

    Assuming you are running PostgreSQL 9.4, you can put US.txt in the directory /var/lib/postgresql/9.4/main/.

    If you're running your COPY command from a script, you can add a step that builds the COPY command with the correct absolute path.

    echo "COPY geonames FROM '$MYPWD/US.txt' DELIMITER E'\t';"

    You can then redirect this output into a file and execute it (the script name here is a placeholder):

    ./your_copy_script.sh > COPY_abs_path.sql
    psql -f COPY_abs_path.sql -d your_db_name
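The same idea can be sketched in Python (function name is illustrative, not from the original answer): build the COPY statement from the file's absolute path, which works when psql runs on the same machine as the server, since the server resolves relative paths against its own working directory.

```python
import os

def copy_statement(table, filename):
    # The PostgreSQL server resolves relative paths against its own
    # working directory, so hand it an absolute path instead.
    path = os.path.abspath(filename)
    return f"COPY {table} FROM '{path}' DELIMITER E'\\t';"

sql = copy_statement("geonames", "US.txt")
print(sql)
```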



  • A dplyr/tidyr option would be to group_by `V1` and `V2`, slice the row with the maximum `V4` in each group, and then spread to wide format.

    library(dplyr)
    library(tidyr)

    DT %>%
      group_by(V1, V2) %>%
      slice(which.max(V4)) %>%
      select(-V4) %>%
      spread(V2, V3)
    #     V1   `1`   `2`
    #  <chr> <chr> <chr>
    #1     a     B    cc
    #2     b    st    st
    #3     c    cc    NA
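For comparison, the same keep-the-max-V4-row-then-widen idea, sketched in pandas (a translation added here, not part of the original answer):

```python
import pandas as pd

DT = pd.DataFrame({
    "V1": ["a", "a", "a", "b", "b", "c"],
    "V2": [1, 2, 1, 1, 2, 1],
    "V3": ["st", "cc", "B", "st", "st", "cc"],
    "V4": [0, 0, 1, 0, 1, 1],
})

# Sort so the largest V4 comes first, keep one row per (V1, V2) group,
# then pivot V2 into columns with V3 as the cell values.
wide = (DT.sort_values("V4", ascending=False)
          .drop_duplicates(["V1", "V2"])
          .pivot(index="V1", columns="V2", values="V3"))
print(wide)
# V2   1    2
# V1
# a    B   cc
# b   st   st
# c   cc  NaN
```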

  • Here is one option where you take the relevant subset before dcasting:

    DT[order(V4, decreasing = TRUE)
       ][, dcast(unique(.SD, by = c("V1", "V2")), V1 ~ V2, value.var = "V3")]
    #    V1  1    2
    # 1:  a  B   cc
    # 2:  b st   st
    # 3:  c cc <NA>

    Alternatively order and use a custom function in dcast():

    dcast(
      DT[order(V4, decreasing = TRUE)],
      V1 ~ V2,
      value.var = "V3",
      fun.aggregate = function(x) x[1]
    )

  • I've got a data.table that I'd like to dcast based on three columns (V1, V2, V3). There are, however, some duplicates in V3, and I need an aggregate function that looks at a fourth column V4 and picks the value of V3 corresponding to the maximum of V4. I'd like to do this without aggregating DT separately prior to dcasting. Can this aggregation be done in the aggregate function of dcast, or do I need to aggregate the table separately first?

    Here is my data.table DT:

    > DT <- data.table(V1 = c('a','a','a','b','b','c'),
                       V2 = c(1,2,1,1,2,1),
                       V3 = c('st','cc','B','st','st','cc'),
                       V4 = c(0,0,1,0,1,1))
    > DT
       V1 V2 V3 V4
    1:  a  1 st  0
    2:  a  2 cc  0
    3:  a  1  B  1   ## --> I want this row to be picked in dcast when V1 = a and V2 = 1, because V4 is largest
    4:  b  1 st  0
    5:  b  2 st  1
    6:  c  1 cc  1

    and the dcast function could look something like this:

    > dcast(DT,
            V1 ~ V2,
            value.var = "V3"
            #, fun.aggregate = V3[max.which(V4)]   ## ?!?!?!??!
            )

    My desired output is:

    > desired
       V1  1    2
    1:  a  B   cc
    2:  b st   st
    3:  c cc <NA>

    Please note that aggregating DT before dcasting to get rid of the duplicates will solve the issue. I'm just wondering if dcasting can be done with the duplicates.

  • The input image is enormous.

    Even 11GB of Video Ram (GTX 1080 Ti for example) is not enough to handle such a size.

    2GB GPU is really not enough for deep learning, especially for computer vision.

    First of all, please use a GPU with at least 4GB of VRAM: 6GB is a good starting point, and 8GB is more comfortable.

    Second of all, try to reduce the initial image size from 3584x2796 to 448x350. In practice, you divide both the height and the width by 8.

    This is not about Keras optimization, but about the memory size of your GPU and the huge size of the input image.
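    When the hardware cannot be changed, a common workaround (an addition here, not from the original answer) is tiled inference: run the network on overlapping crops and stitch the outputs, so only one small tile is ever resident in GPU memory at a time. A minimal NumPy-only sketch of the tiling bookkeeping, with an identity function standing in for the real model:

```python
import numpy as np

def predict_tiled(image, model_fn, tile=448, overlap=32):
    """Run model_fn on overlapping tiles and average the stitched results.
    model_fn must map an (h, w) array to an (h, w) array of the same shape."""
    H, W = image.shape
    out = np.zeros((H, W), dtype=np.float32)
    weight = np.zeros((H, W), dtype=np.float32)
    step = tile - overlap
    for y in range(0, H, step):
        for x in range(0, W, step):
            # Clamp the window so it never runs past the image border.
            y0 = max(min(y, H - tile), 0)
            x0 = max(min(x, W - tile), 0)
            y1, x1 = min(y0 + tile, H), min(x0 + tile, W)
            out[y0:y1, x0:x1] += model_fn(image[y0:y1, x0:x1])
            weight[y0:y1, x0:x1] += 1.0
    return out / weight  # average where tiles overlap

# Identity "model" as a stand-in for the real U-Net:
img = np.random.rand(3584, 2796).astype(np.float32)
result = predict_tiled(img, lambda t: t)
```

    With an identity model the stitched output reproduces the input, which makes the bookkeeping easy to verify; for a real convolutional network, a larger overlap reduces seam artifacts at tile borders.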

  • I want to run inference on a 3584x2796 2D image on a 2 GB GPU using Keras. Is it possible? The network architecture is a U-Net.

    The out-of-memory error comes every time during inference. Even the convolution for the first layer alone fails. It seems Keras is not optimized for large image convolutions; even the network below fails during inference on a 2 GB GPU.

    Layer (type)                 Output Shape              Param #
    =================================================================
    input_3 (InputLayer)         (None, 3584, 2796, 1)     0
    _________________________________________________________________
    conv2d_4 (Conv2D)            (None, 3584, 2796, 32)    80
    _________________________________________________________________
    conv2d_5 (Conv2D)            (None, 3584, 2796, 32)    584
    _________________________________________________________________
    instance_normalization_3 (In (None, 3584, 2796, 32)    2
    _________________________________________________________________
    activation_3 (Activation)    (None, 3584, 2796, 32)    0
    =================================================================
    Total params: 666
    Trainable params: 666
    Non-trainable params: 0

    I get below error:

    OOM when allocating tensor with shape[1,32,3584,2796] and type float
    on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    [[{{node conv2d_2/convolution}} = Conv2D[T=DT_FLOAT, data_format="NCHW",
        dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1],
        use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"]
        (conv2d_2/convolution-0-TransposeNHWCToNCHW-LayoutOptimizer, conv2d_2/kernel/read)]]
    Hint: If you want to see a list of allocated tensors when OOM happens,
    add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    [[{{node activation_2/Relu/_11}} = _Recv[client_terminated=false,
        recv_device="/job:localhost/replica:0/task:0/device:CPU:0",
        send_device="/job:localhost/replica:0/task:0/device:GPU:0",
        send_device_incarnation=1, tensor_name="edge_47_activation_2/Relu",
        tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
    Hint: If you want to see a list of allocated tensors when OOM happens,
    add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
