Transfer Files From SFTP to S3

Mounting the bucket on a Linux server. One approach is to mount the bucket with the s3fs filesystem (or similar) on a Linux server (e.g. Amazon EC2) and use the server's built-in SFTP server to access the bucket: install s3fs; add your security credentials in the form access-key-id:secret-access-key to /etc/passwd-s3fs; and add a bucket mount entry to /etc/fstab (a fuse.s3fs entry for the mount point, e.g. /mnt/, with options such as rw,nosuid,nodev,allow_other). Many transfer tools support FTP/S, SFTP, Dropbox, Google Drive, Amazon S3, Azure Blob, and Box, with various file transfer modes, so you can back up, move, and synchronize files between any two of these services.
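For instance, the two configuration files mentioned above might look like this. This is only a sketch: the bucket name, mount point, and keys are placeholders (the keys are the standard AWS documentation examples), and s3fs option names can vary slightly between versions.

```
# /etc/passwd-s3fs (must be chmod 600): access-key-id:secret-access-key
AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

# /etc/fstab entry: mount bucket "mybucket" at /mnt/mybucket via s3fs
mybucket /mnt/mybucket fuse.s3fs _netdev,rw,nosuid,nodev,allow_other 0 0
```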

I am a beginner with Boto3, and I would like to transfer a file from an S3 bucket to an SFTP server directly.

My final goal is to write a Python script for AWS Glue.

I have found an article that shows how to transfer a file from an SFTP server to an S3 bucket:
https://medium.com/better-programming/transfer-file-from-ftp-server-to-a-s3-bucket-using-python-7f9e51f44e35

Unfortunately I can't find anything which does the opposite action. Do you have any suggestions/ideas?

My first, wrong attempt downloaded the whole file into local memory and then moved it to the SFTP server; I would like to avoid that.
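One way to do the transfer without staging the file locally is to pipe the S3 object's streaming body straight into paramiko's SFTP client. This is only a sketch under assumed names: the bucket, key, host, and credentials in `main` are placeholders, not anything from the original question.

```python
def stream_s3_to_sftp(s3_client, bucket, key, sftp_client, remote_path):
    """Copy one S3 object to an SFTP path without writing it to disk.

    s3_client is a boto3 S3 client; sftp_client is a paramiko SFTPClient.
    get_object returns a streaming, file-like Body, and putfo reads it
    in chunks, so the whole file is never held locally.
    """
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    sftp_client.putfo(body, remote_path)


def main():
    # Placeholder connection details -- replace with your own.
    import boto3
    import paramiko

    transport = paramiko.Transport(("sftp.example.com", 22))
    transport.connect(username="user", password="secret")
    sftp = paramiko.SFTPClient.from_transport(transport)
    try:
        stream_s3_to_sftp(boto3.client("s3"), "my-bucket",
                          "path/to/file.csv", sftp, "/upload/file.csv")
    finally:
        sftp.close()
        transport.close()
```

The same pattern should work inside an AWS Glue Python shell job, since the function only needs a configured boto3 client and an open SFTP session.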

How to transfer files using SCP, SFTP, FUSE and S3

Some people choose to move files via the web console, since you are only asked for your Duo authentication when first logging in and can then keep the window open to transfer files selectively. To learn more about this option, take a look at the documentation for Open OnDemand.

Why use SSH/SCP/SFTP for file transfer?

SCP and SFTP both run over ssh and are thus encrypted. There are implementations available for all common operating systems including Linux, Windows, and Mac OS X.

Windows

GUI:
  • WinSCP
    • Host: login.hpc.caltech.edu
    • Enter your username and password.
  • FileZilla
Command Line:
  • pscp, psftp - part of the PuTTY tools

Linux

Command Line:
  • Start Terminal (Applications->Accessories->Terminal)
    • To copy files from your computer to the central cluster
      • Type scp local_filename username@login.hpc.caltech.edu:~/username/
    • To copy files from the central cluster to your computer
      • Type scp username@login.hpc.caltech.edu:/home/username/remote_filename .

Mac OS X

Command Line:
  • Start Terminal (Applications->Utilities->Terminal)
    • To copy files from your computer to the central cluster
      • Type scp local_filename username@login.hpc.caltech.edu:~/username/
    • To copy files from the central cluster to your computer
      • Type scp username@login.hpc.caltech.edu:~/username/remote_filename .
SSHFS on Mac OS X
If you prefer filesystem-like access, you may use FUSE for macOS together with SSHFS. This works over the SSH protocol and is therefore encrypted, just like standard SSH/SCP/SFTP, but with the added benefit of drag-and-drop transfers.
  • Download and install FUSE and SSHFS here.
  • Make a local mount directory on your Mac. mkdir ~/Desktop/HPC-Mount
  • Run a command similar to the following, swapping out your username and directory name.
  • sshfs -o allow_other,defer_permissions,auto_cache remote-username@login.hpc.caltech.edu:/home/remote-username ~/Desktop/HPC-Mount
GUI:
  • Cyberduck.
    • Cyberduck can be made to work with two-factor authentication:
      • Click on 'Open Connection'
      • Choose 'SFTP'
      • Enter your username and password, then click Connect
      • In the 'Provide additional login credentials' box, enter 1 in the password field and press Enter if using the smartphone app.
      • You should be prompted on your cell phone to allow the connection.
      • If using a YubiKey, you can touch it when prompted to complete the login.
Globus is a fast, reliable file transfer service that makes it easy for users to move data between two GridFTP servers or between a GridFTP server and a user's machine (Windows, Mac or Linux).
Set up the Globus endpoint via their website.
  1. Go to https://app.globus.org/file-manager/gcp to set up a new Globus endpoint
  2. Select California Institute of Technology > Continue
  3. Sign in with your access.caltech credentials
  4. Set the Endpoint display name to 'central-hpc' (or something similar)
  5. Click 'Generate setup key' and copy that to a secure location.
Set up the Globus Connect Personal client under your account on the Central HPC.
SSH to the Central HPC, then run the following.
  1. module load globusconnectpersonal/3.0.2
  2. globusconnectpersonal -setup
  3. globusconnectpersonal -start &
The daemon should now be running in the background and connected to the external Globus service. You should be able to browse your home directory and transfer data to and from it.

If you need to allow Globus access to another directory (for instance /central/groups/xxx) perform the following.
Edit ~/.globusonline/lta/config-paths, adding a line like the one below after the existing entry. The trailing 1 makes the directory read/write in Globus; setting it to 0 makes it read-only.
/central/groups/xxx/,0,1
Restart the Globus daemon to pick up the changes.
  1. globusconnectpersonal -stop
  2. globusconnectpersonal -start &
You should now be able to navigate to the additional directory via the Globus website. Keep in mind that even though you've added a directory to allow Globus access, existing Unix permissions still determine which files and directories you can access on the cluster.
If your data is in Amazon S3 you may use the awscli tools which are already installed as a module on the cluster.

  • Log into the cluster and run module load awscli/1.15.27
  • Type aws configure and enter your AWS access key ID and secret access key. (You generate these on the IAM credentials page in the AWS console.)
  • Run a command similar to the following to copy data from S3 to your cluster home directory.
  • aws s3 cp --recursive s3://my-bucket-name/subfolder/ ~/destination-directory/
  • Run a command similar to the following to copy data from the cluster to a pre-existing S3 bucket.
  • aws s3 cp --recursive ~/source-directory/ s3://my-bucket-name/subfolder/
  • More s3 examples are available here.
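If you prefer to script the copy in Python, a rough boto3 equivalent of the recursive `aws s3 cp` above might look like the following. This is a sketch, not cluster-specific code: the function takes the S3 client as an argument, and the bucket, prefix, and destination names in the usage note are placeholders.

```python
import os


def download_prefix(s3, bucket, prefix, dest_dir):
    """Download every object under `prefix` from `bucket` into
    `dest_dir`, roughly mirroring `aws s3 cp --recursive`.
    `s3` is a boto3 S3 client."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            rel = obj["Key"][len(prefix):].lstrip("/")
            if not rel:  # skip bare "directory marker" keys
                continue
            target = os.path.join(dest_dir, rel)
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            s3.download_file(bucket, obj["Key"], target)


# Usage (requires credentials set up via `aws configure`):
#   import boto3
#   download_prefix(boto3.client("s3"), "my-bucket-name", "subfolder",
#                   os.path.expanduser("~/destination-directory"))
```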
If your data is in Google Cloud Storage you may use the gsutil which is installed as a module on the cluster.

  • Log into the cluster and run module load python/2.7.15 gcloud/latest
  • Run gcloud auth login to configure the Google SDK for your GCP account if needed.

  • Run the following command to copy data from the cluster to Google Cloud Storage.

  • gsutil cp ~/kitten.png gs://my-awesome-bucket

  • More gsutil examples here.




