What is PyTorch?

Dec 31, 2023

Contents

What is PyTorch?
Getting started with PyTorch

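What is PyTorch?

PyTorch is a machine learning framework.

A PyTorch tensor is similar to a NumPy array.

With PyTorch, deep learning models can be trained efficiently by running computations on a GPU.

It is not part of the Python standard library, so it must be installed separately.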

Start Locally

https://pytorch.org/get-started/locally/

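The author of this note uses Windows, Conda, and Python, and chose the matching options in the selector on that page.

Choosing the Stable build is also recommended.
Most PyTorch users want to run deep learning models on a GPU, so this note assumes that goal and omits a detailed guide to choosing a CUDA version.

Check whether your GPU supports CUDA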

NVIDIA CUDA GPUs - Compute Capability

Explore your GPU compute capability and learn more about CUDA-enabled desktops, notebooks, workstations, and supercomputers.

https://developer.nvidia.com/cuda-gpus

If the steps above are difficult, Google Colab lets you use PyTorch without installing anything.
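Once PyTorch is available (locally or on Colab), a quick check along these lines can confirm the installed version and whether a CUDA GPU is visible. This is a minimal sketch; the variable names are only illustrative.

```python
import torch

# Version string of the installed PyTorch build
version = torch.__version__

# True only if this build has CUDA support and a usable GPU is visible
cuda_ok = torch.cuda.is_available()

print(f"PyTorch version: {version}")
print(f"CUDA available: {cuda_ok}")
```

If `cuda_ok` is False, the examples that call cuda() will fail, but everything else runs on the CPU.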

Getting started with PyTorch


import torch

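Creating a tensor

Converting a list to a tensor

What is a tensor?

Simply put, a tensor is a multidimensional array.

Functionally, it is very similar to a NumPy array.

Tensors support automatic differentiation.

Images, text, and most other data fed into deep learning models can be represented as tensors.

For a more detailed explanation, see the blog post below.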

딥러닝 필수개념: 텐서 TENSOR 이해하기 (Essential deep learning concepts: understanding tensors)

https://finnsplace.tistory.com/21
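To make the point about images and text concrete, here is a small sketch of how such inputs might look as tensors. The shapes and token ids are made-up values chosen only for illustration.

```python
import torch

# A hypothetical batch of 8 RGB images, 32x32 pixels each:
# (batch, channels, height, width)
images = torch.rand(8, 3, 32, 32)

# A hypothetical batch of 2 tokenized sentences, 5 integer token ids each:
# (batch, sequence length)
token_ids = torch.tensor([[12, 7, 3, 0, 0],
                          [5, 9, 21, 4, 2]])

print(images.shape, images.dtype)
print(token_ids.shape, token_ids.dtype)
```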


data = [
	[1,2],
	[3,4]
]

# list
print(data)
>>>[[1, 2], [3, 4]]

# tensor
test_tensor = torch.tensor(data)
print(test_tensor)
>>>tensor([[1, 2],
        [3, 4]])

Checking whether a tensor is on the GPU

When you create a tensor in PyTorch, is it placed on the CPU or on the GPU?

By default, it is placed on the CPU.


print(test_tensor.is_cuda)
>>> False
print(test_tensor.is_cpu)
>>> True

A tensor can be moved to the GPU with the cuda() method (this requires a CUDA-capable GPU).


# move test_tensor to the GPU
test_tensor = test_tensor.cuda()

# ask again: is it on the GPU?
print(test_tensor.is_cuda)
>>> True
# is it on the CPU?
print(test_tensor.is_cpu)
>>> False

It can also be moved back to the CPU with the cpu() method.


# move test_tensor back to the CPU
test_tensor = test_tensor.cpu()

# ask again: is it on the GPU?
print(test_tensor.is_cuda)
>>> False
# is it on the CPU?
print(test_tensor.is_cpu)
>>> True

When an operation involves two tensors, both must be on the same device (GPU or CPU).


a = torch.tensor([
    [1,1],
    [2,2]
])

b = torch.tensor([
    [5,6],
    [7,8]
])

# matrix product of a and b, both on the CPU
print(torch.matmul(a, b))
>>>tensor([[12, 14],
        [24, 28]])

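Operating on tensors that live on different devices raises an error.


# a is on the GPU, b is on the CPU
print(torch.matmul(a.cuda(), b))
>>> RuntimeError (the tensors are on different devices)

Therefore, move every tensor involved to the GPU before performing the operation.


# cast to float before moving: integer matmul may not be implemented on CUDA in some PyTorch builds
gpu_a = a.float().cuda() # move a to the GPU
gpu_b = b.float().cuda() # move b to the GPU

# matrix product
print(torch.matmul(gpu_a, gpu_b))
>>>tensor([[12., 14.],
        [24., 28.]], device='cuda:0')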
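A common way to avoid device mismatches is to choose one device up front and move every tensor to it with to(). The sketch below assumes nothing about the machine: it falls back to the CPU when no GPU is found. The name `device` is just a convention, not something PyTorch requires.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.tensor([[1., 1.], [2., 2.]]).to(device)
b = torch.tensor([[5., 6.], [7., 8.]]).to(device)

# Both operands are on the same device, so the product is valid
result = torch.matmul(a, b)
print(result.device)

# Bring the result back to the CPU, e.g. before converting it to NumPy
result_cpu = result.cpu()
print(result_cpu)
```

Written this way, the same script runs unchanged on machines with and without a GPU.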

Tensor attributes

Basic attributes of a tensor

Shape
Data type (dtype)
Device on which it is stored


# torch.rand creates a tensor of the given shape filled with uniform random values in [0, 1)
tensor_rand = torch.rand(3,3)

# print the tensor
print(tensor_rand)
>>> tensor([[0.6774, 0.2543, 0.9682],
        [0.5753, 0.6983, 0.7663],
        [0.5654, 0.0140, 0.7378]])

# print the tensor's shape
print(f'Shape: {tensor_rand.shape}')
>>> Shape: torch.Size([3, 3])

# print the tensor's dtype
print(f'Data Type: {tensor_rand.dtype}')
>>> Data Type: torch.float32

# print the device the tensor is stored on
print(f'Device: {tensor_rand.device}')
>>> Device: cpu

Ways to initialize a tensor

(1) From list data


list_data = [
    [1,2],
    [3,4]
]

tensor_data = torch.tensor(list_data)
print(tensor_data)
>>> tensor([[1, 2],
        [3, 4]])

(2) From NumPy data


import numpy as np

np_data = np.random.rand(3,3)     # initialize a 3x3 array with random values
print(type(np_data))        # print the type
>>> <class 'numpy.ndarray'>

print(np_data)              # print the array values
>>> [[0.48532938 0.17477689 0.23010243]
 [0.58407465 0.12204853 0.69778234]
 [0.68498744 0.37607972 0.91400143]]

tensor_data = torch.from_numpy(np_data) # create a tensor from the NumPy array with from_numpy
print(type(tensor_data))
>>> <class 'torch.Tensor'>

print(tensor_data)
>>> tensor([[0.4853, 0.1748, 0.2301],
        [0.5841, 0.1220, 0.6978],
        [0.6850, 0.3761, 0.9140]], dtype=torch.float64)
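Two details of from_numpy are easy to overlook: the tensor keeps NumPy's dtype (float64 above), and it shares memory with the source array instead of copying it. The sketch below contrasts it with torch.tensor, which copies; the variable names are illustrative.

```python
import numpy as np
import torch

np_data = np.zeros(3)

shared = torch.from_numpy(np_data)   # shares memory with np_data
copied = torch.tensor(np_data)       # copies the data

np_data[0] = 1.0                     # modify the NumPy array in place

print(shared)   # reflects the change made through np_data
print(copied)   # still all zeros
print(shared.dtype)
```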

(3) From another tensor


tensor_data = torch.tensor([
    [1, 2],
    [3, 4]
])

# new tensor with the same shape and dtype as tensor_data, filled with zeros
new_tensor_data = torch.zeros_like(tensor_data)
print(new_tensor_data)
>>>tensor([[0, 0],
        [0, 0]])

# new tensor with the same shape and dtype as tensor_data, filled with ones
new_tensor_data = torch.ones_like(tensor_data)
print(new_tensor_data)
>>> tensor([[1, 1],
        [1, 1]])

# same shape as tensor_data, but override the dtype to float and fill with random values
new_tensor_data = torch.rand_like(tensor_data, dtype=torch.float32)
print(new_tensor_data)
>>> tensor([[0.2580, 0.6950],
        [0.5063, 0.0099]])

Type conversion and dimension manipulation

Tensors can be manipulated much like NumPy arrays.

(1) Indexing into a tensor

Assume we have created the following tensor.


tensor_data = torch.tensor([
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12],
    [13,14,15,16]
])

print(tensor_data)
>>> tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12],
        [13, 14, 15, 16]])

Indexing elements


# element access
print(tensor_data[0,0]) # first element
>>> tensor(1)

print(tensor_data[-1,-1]) # last element
>>> tensor(16)

Indexing rows


# row access
print(tensor_data[0]) # first row
>>> tensor([1, 2, 3, 4])

print(tensor_data[1]) # second row
>>> tensor([5, 6, 7, 8])

print(tensor_data[1:3]) # second and third rows (index 1 up to, but not including, 3)
>>> tensor([[ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

print(tensor_data[-1]) # last row
>>> tensor([13, 14, 15, 16])

print(tensor_data[-2:-1]) # second-to-last row only (the end index -1 is excluded)
>>> tensor([[ 9, 10, 11, 12]])

Indexing columns


# column access
print(tensor_data[:,0]) # first column == column 0 of every row (:)
>>> tensor([ 1,  5,  9, 13])

print(tensor_data[...,0]) # first column == index 0 of the last dimension (...)
>>> tensor([ 1,  5,  9, 13])

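What is the difference between tensor_data[:, 0] and tensor_data[..., 0]?

In tensor_data[..., 0], the ellipsis (...) stands for as many full slices (:) as are needed to cover all leading dimensions.

→ It selects index 0 along the last dimension.

In tensor_data[:, 0], the single : covers only the first dimension.

→ It selects index 0 along the second dimension.

For a 2-D tensor the two expressions give the same result, which makes the difference easy to miss.

Consider a 5x5 tensor with 3 channels instead.


# treat this as a 5x5 image with 3 channels
tensor_test = torch.rand((3,5,5), dtype = torch.float32)
print(tensor_test) # original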
>>> tensor([[[0.5344, 0.8263, 0.8238, 0.2038, 0.7460],
         [0.8512, 0.3118, 0.9364, 0.7568, 0.4285],
         [0.0068, 0.1527, 0.1556, 0.7344, 0.8795],
         [0.3109, 0.9227, 0.1063, 0.2597, 0.2508],
         [0.0292, 0.3631, 0.5051, 0.5458, 0.0895]],

        [[0.5134, 0.0746, 0.5357, 0.7415, 0.5086],
         [0.8754, 0.8747, 0.9809, 0.6926, 0.1750],
         [0.1540, 0.7346, 0.7030, 0.0397, 0.0471],
         [0.6680, 0.1338, 0.9194, 0.7807, 0.8862],
         [0.9758, 0.8952, 0.3774, 0.3773, 0.1065]],

        [[0.7793, 0.5237, 0.8530, 0.9483, 0.2488],
         [0.8617, 0.2278, 0.0692, 0.8315, 0.7045],
         [0.9290, 0.5866, 0.1318, 0.0254, 0.5772],
         [0.9803, 0.2799, 0.2412, 0.6680, 0.2312],
         [0.2925, 0.8373, 0.1044, 0.6306, 0.0562]]])

print(f'Using : {tensor_test[:,0]}') # ":" selects the first row of every channel
>>> Using : tensor([[0.5344, 0.8263, 0.8238, 0.2038, 0.7460],
        [0.5134, 0.0746, 0.5357, 0.7415, 0.5086],
        [0.7793, 0.5237, 0.8530, 0.9483, 0.2488]])

print(f'Using ... {tensor_test[...,0]}') # "..." selects the first column of every channel
>>> Using ... tensor([[0.5344, 0.8512, 0.0068, 0.3109, 0.0292],
        [0.5134, 0.8754, 0.1540, 0.6680, 0.9758],
        [0.7793, 0.8617, 0.9290, 0.9803, 0.2925]])
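The difference becomes even clearer with one more dimension. The sketch below uses a hypothetical 4-D batch and compares only the resulting shapes.

```python
import torch

# Hypothetical batch: 2 images, 3 channels, 5x5 pixels
batch = torch.rand(2, 3, 5, 5)

# ":" covers only the first dimension, so index 0 applies to dimension 1 (channel)
colon_shape = tuple(batch[:, 0].shape)
print(colon_shape)      # (2, 5, 5)

# "..." covers every leading dimension, so index 0 applies to the last dimension
ellipsis_shape = tuple(batch[..., 0].shape)
print(ellipsis_shape)   # (2, 3, 5)

# "..." here is equivalent to writing one ":" per leading dimension
same = torch.equal(batch[..., 0], batch[:, :, :, 0])
print(same)             # True
```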

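(2) Concatenating tensors

torch.cat([tensor, …, tensor])

dim: the axis along which the tensors are concatenated

Concatenate along the row axis (dim = 0): the result has more rows
Concatenate along the column axis (dim = 1): the result has more columns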


tensor_data = torch.tensor([
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
])

# dim: the axis along which to concatenate
# concatenate along the row axis (dim = 0)
result = torch.cat([tensor_data, tensor_data, tensor_data], dim = 0)
print(result)
>>> tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12],
        [ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12],
        [ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

# concatenate along the column axis (dim = 1)
result = torch.cat([tensor_data, tensor_data, tensor_data], dim = 1)
print(result)
>>> tensor([[ 1,  2,  3,  4,  1,  2,  3,  4,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  5,  6,  7,  8,  5,  6,  7,  8],
        [ 9, 10, 11, 12,  9, 10, 11, 12,  9, 10, 11, 12]])

What about 3-D tensors?


# treat this as a 5x5 image with 3 channels
# create a 1-D vector from 1 to 75, then reshape it
tensor_test = torch.arange(1, 76).reshape(3, 5, 5).int()
print(tensor_test) 
>>> tensor([[[ 1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10],
         [11, 12, 13, 14, 15],
         [16, 17, 18, 19, 20],
         [21, 22, 23, 24, 25]],

        [[26, 27, 28, 29, 30],
         [31, 32, 33, 34, 35],
         [36, 37, 38, 39, 40],
         [41, 42, 43, 44, 45],
         [46, 47, 48, 49, 50]],

        [[51, 52, 53, 54, 55],
         [56, 57, 58, 59, 60],
         [61, 62, 63, 64, 65],
         [66, 67, 68, 69, 70],
         [71, 72, 73, 74, 75]]], dtype=torch.int32)

result = torch.cat([tensor_test, tensor_test], dim = 0)
print(result)
>>> tensor([[[ 1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10],
         [11, 12, 13, 14, 15],
         [16, 17, 18, 19, 20],
         [21, 22, 23, 24, 25]],

        [[26, 27, 28, 29, 30],
         [31, 32, 33, 34, 35],
         [36, 37, 38, 39, 40],
         [41, 42, 43, 44, 45],
         [46, 47, 48, 49, 50]],

        [[51, 52, 53, 54, 55],
         [56, 57, 58, 59, 60],
         [61, 62, 63, 64, 65],
         [66, 67, 68, 69, 70],
         [71, 72, 73, 74, 75]],

        [[ 1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10],
         [11, 12, 13, 14, 15],
         [16, 17, 18, 19, 20],
         [21, 22, 23, 24, 25]],

        [[26, 27, 28, 29, 30],
         [31, 32, 33, 34, 35],
         [36, 37, 38, 39, 40],
         [41, 42, 43, 44, 45],
         [46, 47, 48, 49, 50]],

        [[51, 52, 53, 54, 55],
         [56, 57, 58, 59, 60],
         [61, 62, 63, 64, 65],
         [66, 67, 68, 69, 70],
         [71, 72, 73, 74, 75]]], dtype=torch.int32)

result = torch.cat([tensor_test, tensor_test], dim = 1)
print(result)
>>> tensor([[[ 1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10],
         [11, 12, 13, 14, 15],
         [16, 17, 18, 19, 20],
         [21, 22, 23, 24, 25],
         [ 1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10],
         [11, 12, 13, 14, 15],
         [16, 17, 18, 19, 20],
         [21, 22, 23, 24, 25]],

        [[26, 27, 28, 29, 30],
         [31, 32, 33, 34, 35],
         [36, 37, 38, 39, 40],
         [41, 42, 43, 44, 45],
         [46, 47, 48, 49, 50],
         [26, 27, 28, 29, 30],
         [31, 32, 33, 34, 35],
         [36, 37, 38, 39, 40],
         [41, 42, 43, 44, 45],
         [46, 47, 48, 49, 50]],

        [[51, 52, 53, 54, 55],
         [56, 57, 58, 59, 60],
         [61, 62, 63, 64, 65],
         [66, 67, 68, 69, 70],
         [71, 72, 73, 74, 75],
         [51, 52, 53, 54, 55],
         [56, 57, 58, 59, 60],
         [61, 62, 63, 64, 65],
         [66, 67, 68, 69, 70],
         [71, 72, 73, 74, 75]]], dtype=torch.int32)

result = torch.cat([tensor_test, tensor_test], dim = 2)
print(result)
>>> tensor([[[ 1,  2,  3,  4,  5,  1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10,  6,  7,  8,  9, 10],
         [11, 12, 13, 14, 15, 11, 12, 13, 14, 15],
         [16, 17, 18, 19, 20, 16, 17, 18, 19, 20],
         [21, 22, 23, 24, 25, 21, 22, 23, 24, 25]],

        [[26, 27, 28, 29, 30, 26, 27, 28, 29, 30],
         [31, 32, 33, 34, 35, 31, 32, 33, 34, 35],
         [36, 37, 38, 39, 40, 36, 37, 38, 39, 40],
         [41, 42, 43, 44, 45, 41, 42, 43, 44, 45],
         [46, 47, 48, 49, 50, 46, 47, 48, 49, 50]],

        [[51, 52, 53, 54, 55, 51, 52, 53, 54, 55],
         [56, 57, 58, 59, 60, 56, 57, 58, 59, 60],
         [61, 62, 63, 64, 65, 61, 62, 63, 64, 65],
         [66, 67, 68, 69, 70, 66, 67, 68, 69, 70],
         [71, 72, 73, 74, 75, 71, 72, 73, 74, 75]]], dtype=torch.int32)
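The long printouts above are easier to follow as shapes: torch.cat only grows the dimension given by dim and leaves the others unchanged. As a contrast, the sketch also shows torch.stack, which is not used in this post and adds a new dimension instead.

```python
import torch

t = torch.arange(1, 76).reshape(3, 5, 5)

# Concatenating two copies doubles the size of the chosen dimension only
cat_shapes = [tuple(torch.cat([t, t], dim=d).shape) for d in range(3)]
print(cat_shapes)   # [(6, 5, 5), (3, 10, 5), (3, 5, 10)]

# torch.stack, by contrast, inserts a new dimension of size 2
stacked_shape = tuple(torch.stack([t, t], dim=0).shape)
print(stacked_shape)   # (2, 3, 5, 5)
```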

(3) Type casting

A tensor's data type (int, float, etc.) can be converted.


# what is type casting?
a, b = 3, 4.0 # a = 3 is an int, b = 4.0 is a float
print(f'type(a): {type(a)}, type(b): {type(b)}')
>>> type(a): <class 'int'>, type(b): <class 'float'>

# a is automatically converted to float for the operation
print(a + b)
>>> 7.0

# cast b to int before the operation
print(a + int(b))
>>> 7


int_a = torch.tensor([2], dtype=torch.int)
float_b = torch.tensor([5.0])

print(int_a.dtype)
>>> torch.int32

print(float_b.dtype)
>>> torch.float32

# the tensor int_a is automatically promoted to float32
print(int_a + float_b) # 2 + 5.0 = 7.0
>>> tensor([7.])

# cast the tensor float_b to int32, then add
print(int_a + float_b.type(torch.int32))
>>> tensor([7], dtype=torch.int32)
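Besides type(), tensors can be cast with to() or with shorthand methods such as int() and double(). The sketch below also shows that a float-to-int cast drops the fractional part rather than rounding; the sample values are arbitrary.

```python
import torch

x = torch.tensor([2.7, -2.7])

as_int = x.to(torch.int32)    # general form: pass the target dtype
as_int2 = x.int()             # shorthand for torch.int32
as_double = x.double()        # shorthand for torch.float64

print(as_int)                 # fractional part is dropped (truncation toward zero)
print(as_double.dtype)

# The dtype that mixed int32/float32 arithmetic is promoted to
promoted = torch.result_type(torch.tensor([2], dtype=torch.int32),
                             torch.tensor([5.0]))
print(promoted)
```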

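(4) Changing the shape of a tensor

A tensor's shape can be changed with Tensor.view() or Tensor.reshape().

Changing the shape does not change the order of the elements.

view() cannot be used in some cases where the tensor is not contiguous in memory.

What does contiguous mean?

[Pytorch] contiguous 원리와 의미 (How contiguous works and what it means)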

https://jimmy-ai.tistory.com/122

So reshape is simply more convenient (NumPy uses reshape as well).


tensor_data = torch.arange(1, 13)
print(tensor_data)
>>> tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

clone_reshape_data = tensor_data.clone().reshape(3,4)
print(clone_reshape_data)
>>> tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

reshape_data = tensor_data.reshape(3,4)
print(reshape_data)
>>> tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

view_data = tensor_data.view(4,3)
print(view_data)
>>> tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])

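# Modify the original in place. reshape() (on a contiguous tensor) and view() return
# views that share memory with tensor_data, while clone() makes an independent copy.
tensor_data[0] = 10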

print(clone_reshape_data)
>>> tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

print(reshape_data)
>>> tensor([[10,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

print(view_data)
>>> tensor([[10,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])
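The contiguity restriction on view() can be reproduced with a transposed tensor. In the sketch below, view() raises a RuntimeError on the non-contiguous tensor, while reshape() succeeds by copying; the variable names are illustrative.

```python
import torch

t = torch.arange(6).reshape(2, 3)
tt = t.t()                       # transpose: same storage, different strides

print(tt.is_contiguous())        # False

try:
    tt.view(6)                   # view needs a compatible memory layout
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)               # True

flat = tt.reshape(6)             # reshape falls back to copying when needed
print(flat)                      # tensor([0, 3, 1, 4, 2, 5])

flat2 = tt.contiguous().view(6)  # explicit copy first, then view works
```

Because reshape() may return either a view or a copy, clone() is the safer choice when an independent tensor is required.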

(5) Swapping tensor dimensions

The dimensions (axes) of a tensor can be reordered.

Most libraries other than torch load an image with the channel dimension last.

e.g. a 32x32 color image → (32, 32, 3): (height, width, channels)

Beyond this case, dimensions often need to be rearranged, and the permute() method makes this easy.


# use (64, 32, 3) so the result of the reordering is easy to verify
tensor_test = torch.rand((64, 32, 3))
print(tensor_test.shape)
>>>torch.Size([64, 32, 3])

tensor_test = tensor_test.permute(2,0,1) # reorder the axes
# axis 0 -> moved to the second position
# axis 1 -> moved to the third position
# axis 2 -> moved to the first position
print(tensor_test.shape)
>>> torch.Size([3, 64, 32])
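A quick way to check that permute() only reorders axes is to look up the same pixel before and after. The sketch below converts a hypothetical (height, width, channels) image to (channels, height, width); the pixel coordinates are arbitrary.

```python
import torch

# Hypothetical image in (height, width, channels) layout
hwc = torch.rand(32, 32, 3)

# Reorder to (channels, height, width)
chw = hwc.permute(2, 0, 1)
print(chw.shape)

# The same pixel is reachable under the new index order
same_pixel = (hwc[10, 20, 1] == chw[1, 10, 20]).item()
print(same_pixel)

# permute returns a view over the same memory; call contiguous()
# when a contiguous layout is required (for example before view())
print(chw.is_contiguous(), chw.contiguous().is_contiguous())
```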

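Tensor operations and functions

As with NumPy, tensors support basic arithmetic, dot products, and other basic operations.

(1) Tensor operations

Same as NumPy

Arithmetic on tensors

Arithmetic works on two tensors of the same shape (or of broadcastable shapes)

Operations are applied element by element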


a = torch.arange(1, 4 + 1).reshape(2,2)
print(a)
>>> tensor([[1, 2],
        [3, 4]])

b = torch.arange(5, 8 + 1).reshape(2,2)
print(b)
>>> tensor([[5, 6],
        [7, 8]])

print(a + b) # addition
>>> tensor([[ 6,  8],
        [10, 12]])

print(a - b) # subtraction
>>> tensor([[-4, -4],
        [-4, -4]])

print(a * b) # element-wise multiplication
>>> tensor([[ 5, 12],
        [21, 32]])

print(a / b) # element-wise division
>>> tensor([[0.2000, 0.3333],
        [0.4286, 0.5000]])

Matrix multiplication (matmul)


# method 1
print(a.matmul(b))
>>> tensor([[19, 22],
        [43, 50]])

# method 2
print(torch.matmul(a, b))
>>> tensor([[19, 22],
        [43, 50]])
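Two related conveniences are worth a short sketch: the @ operator, which is shorthand for matmul, and broadcasting, which lets element-wise operations combine tensors of different but compatible shapes. The tensor `row` is a made-up example.

```python
import torch

a = torch.arange(1, 5).reshape(2, 2)
b = torch.arange(5, 9).reshape(2, 2)

# The @ operator gives the same result as torch.matmul
same = torch.equal(a @ b, torch.matmul(a, b))
print(same)          # True

# Broadcasting: row has shape (2,) and is added to every row of a
row = torch.tensor([10, 20])
broadcast_sum = a + row
print(broadcast_sum)
```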

(2) Computing the mean of a tensor

Only available for floating-point (or complex) dtypes

torch.mean()

Error

torch.mean() requires a floating-point (or complex) input, so it fails on integer tensors


# create a 2x4 tensor
tensor_test = torch.tensor([
    [1,2,3,4],
    [5,6,7,8]
])

tensor_data = torch.arange(1, 8 + 1).reshape(2, 4)

print(tensor_test.dtype)
>>> torch.int64

print(tensor_data.dtype)
>>> torch.int64

print(tensor_test.mean()) # RuntimeError: integer dtype
print(tensor_data.mean()) # RuntimeError: integer dtype

Solution


# create a 2x4 tensor (torch.Tensor produces float32)
tensor_test = torch.Tensor([
    [1,2,3,4],
    [5,6,7,8]
])

print(tensor_test.mean()) # mean over all elements
>>> tensor(4.5000)

print(tensor_test.mean(dim = 0)) # mean of each column (reduces over the rows)
>>> tensor([3., 4., 5., 6.])

print(tensor_test.mean(dim = 1)) # mean of each row (reduces over the columns)
>>> tensor([2.5000, 6.5000])

What was the problem?

torch.mean() requires a floating-point (or complex) input

The tensor we created had an integer dtype, which caused the error
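The failure can be observed directly by catching the exception. This is a minimal sketch; the exact error message may differ between PyTorch versions, so it is printed rather than compared.

```python
import torch

int_tensor = torch.arange(1, 9).reshape(2, 4)   # dtype: torch.int64

try:
    int_tensor.mean()
    raised = False
except RuntimeError as err:
    raised = True
    print(f"mean() failed: {err}")

# Casting to a floating-point dtype first avoids the error
mean_value = int_tensor.float().mean()
print(mean_value)
```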

Problem 1


tensor_test = torch.tensor([
    [1,2,3,4],
    [5,6,7,8]
])
print(tensor_test.dtype)
>>> torch.int64

Solution 1: type casting


print(tensor_test.dtype)
>>> torch.int64

tensor_test = tensor_test.type(dtype = torch.float32)

print(tensor_test.dtype)
>>> torch.float32

Solution 2: include a floating-point literal when creating the tensor


tensor_test = torch.tensor([
    [1.0,2,3,4],
    [5,6,7,8]
])
print(tensor_test.dtype)
>>> torch.float32

Solution 3: create the tensor with torch.Tensor([])


tensor_test = torch.Tensor([
    [1,2,3,4],
    [5,6,7,8]
])
print(tensor_test.dtype)
>>> torch.float32

What is the difference between torch.tensor([]) and torch.Tensor([])?

What is torch.tensor([])?

torch.tensor is a function that takes data as an argument and creates a new tensor from it.

It infers the data type automatically from the data.

What is torch.Tensor([])?

torch.Tensor is PyTorch's basic tensor class; its constructor can create a new tensor of a given size.

The torch.Tensor constructor can create an uninitialized tensor.

e.g. torch.Tensor(2, 3) → creates an uninitialized 2x3 tensor

When given a list or array, torch.Tensor behaves much like torch.tensor, except that the result uses the default dtype, torch.float32.
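The differences described above can be checked side by side. In this sketch the values inside `uninit` are whatever happened to be in memory, so only its shape is inspected.

```python
import torch

inferred = torch.tensor([1, 2, 3])   # dtype inferred from the data
default = torch.Tensor([1, 2, 3])    # uses the default dtype

print(inferred.dtype)
print(default.dtype)

# With integer arguments, torch.Tensor interprets them as a shape
uninit = torch.Tensor(2, 3)
print(uninit.shape)                  # contents are uninitialized

# torch.tensor also accepts an explicit dtype
explicit = torch.tensor([1, 2, 3], dtype=torch.float32)
print(explicit.dtype)
```

Passing dtype explicitly to torch.tensor is one more way to obtain a float tensor from integer data.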

(3) Summing a tensor

Works regardless of dtype

torch.sum()


tensor_data = torch.tensor([
    [1,2,3,4],
    [5,6,7,8]
])

print(tensor_data.sum()) # sum over all elements
>>> tensor(36)

print(tensor_data.sum(dim = 0)) # sum of each column (reduces over the rows)
>>> tensor([ 6,  8, 10, 12])

print(tensor_data.sum(dim = 1)) # sum of each row (reduces over the columns)
>>> tensor([10, 26])
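Reductions such as sum() drop the reduced dimension by default. The keepdim=True argument keeps it with size 1, which is convenient when the result is combined with the original tensor. The sketch below uses it to normalize each row; the normalization is only an illustration.

```python
import torch

t = torch.tensor([[1, 2, 3, 4],
                  [5, 6, 7, 8]])

col_sum = t.sum(dim=0)                     # shape (4,)
col_sum_kept = t.sum(dim=0, keepdim=True)  # shape (1, 4)
row_sum_kept = t.sum(dim=1, keepdim=True)  # shape (2, 1)

print(col_sum.shape, col_sum_kept.shape, row_sum_kept.shape)

# The kept dimension broadcasts against t, so each row is divided by its own sum
row_fraction = t / row_sum_kept
print(row_fraction)
```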

(4) Maximum of a tensor

Works regardless of dtype

torch.max(): returns the maximum element

torch.argmax(): returns the index of the largest element


tensor_data = torch.arange(1, 9 + 1).reshape(3,3)
print(tensor_data)
>>> tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

print(tensor_data.max()) # maximum over all elements
>>> tensor(9)

print(tensor_data.argmax()) # index of the maximum in the flattened tensor
>>> tensor(8)

print(tensor_data.max(dim = 0)) # maximum of each column
>>> torch.return_types.max(
values=tensor([7, 8, 9]),
indices=tensor([2, 2, 2]))

print(tensor_data.max(dim = 1)) # maximum of each row
>>> torch.return_types.max(
values=tensor([3, 6, 9]),
indices=tensor([2, 2, 2]))
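When dim is given, max() returns a (values, indices) pair that can be unpacked directly. Without dim, argmax() returns an index into the flattened tensor; the sketch below converts it to a (row, column) pair with plain integer arithmetic.

```python
import torch

t = torch.arange(1, 10).reshape(3, 3)

# Unpack the named tuple returned by max(dim=...)
values, indices = t.max(dim=1)
print(values, indices)

# Convert the flat argmax index into (row, column)
flat_index = t.argmax().item()
row, col = divmod(flat_index, t.shape[1])
print(row, col)   # 2 2

# argmax also accepts dim and then returns one index per column or row
col_argmax = t.argmax(dim=0)
print(col_argmax)
```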

(5) Adding and removing dimensions

unsqueeze() adds a dimension of size 1

It is commonly used to add a batch dimension

squeeze() removes dimensions of size 1


tensor_data = torch.arange(1, 8 + 1).reshape(2,4)
print(tensor_data)
>>> tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])

# check the shape
print(tensor_data.shape)
>>> torch.Size([2, 4])

# add a dimension at the front
tensor_data = tensor_data.unsqueeze(0)
print(tensor_data.shape)
>>> torch.Size([1, 2, 4])

# add a dimension at the end
tensor_data = tensor_data.unsqueeze(-1)
print(tensor_data.shape)
>>> torch.Size([1, 2, 4, 1])

# remove every dimension of size 1
tensor_data = tensor_data.squeeze()
print(tensor_data.shape)
>>> torch.Size([2, 4])
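The batch-dimension use case might look like the sketch below. It also shows why passing an explicit dim to squeeze() is often safer: squeeze() with no argument removes every size-1 dimension, including ones that were meant to stay. The image sizes are hypothetical.

```python
import torch

image = torch.rand(3, 32, 32)     # a single (channels, height, width) image

batch = image.unsqueeze(0)        # add a batch dimension: (1, 3, 32, 32)
restored = batch.squeeze(0)       # remove only dimension 0: (3, 32, 32)
print(batch.shape, restored.shape)

# A grayscale image has a channel dimension of size 1
gray_batch = torch.rand(1, 32, 32).unsqueeze(0)   # (1, 1, 32, 32)
squeezed_all = gray_batch.squeeze()               # (32, 32): channel dimension is gone too
squeezed_one = gray_batch.squeeze(0)              # (1, 32, 32): channel dimension kept
print(squeezed_all.shape, squeezed_one.shape)
```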

Automatic differentiation and gradients

PyTorch can differentiate operations automatically.


import torch

# gradients are tracked only when requires_grad is set
x = torch.tensor([1.0, 4], requires_grad=True)
y = torch.tensor([5.0, 6], requires_grad=True)
z = x + y 

print(z)
>>> tensor([ 6., 10.], grad_fn=<AddBackward0>)

print(z.grad_fn)
>>> <AddBackward0 object at 0x000002E49040BB70>

out = z.mean()
print(out)
>>> tensor(8., grad_fn=<MeanBackward0>)

print(out.grad_fn) 
>>> <MeanBackward0 object at 0x000002E4903C72E8>

out.backward()
print(x.grad)
>>> tensor([0.5000, 0.5000])

print(y.grad)
>>> tensor([0.5000, 0.5000])

print(z.grad) # .grad is populated only for leaf tensors; z is an intermediate result
>>> None
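If the gradient of an intermediate tensor such as z is needed, retain_grad() can be called on it before backward(). The sketch below repeats the example above with that one extra call.

```python
import torch

x = torch.tensor([1.0, 4.0], requires_grad=True)
y = torch.tensor([5.0, 6.0], requires_grad=True)

z = x + y
z.retain_grad()        # ask autograd to keep the gradient of this non-leaf tensor

out = z.mean()
out.backward()

print(x.is_leaf, z.is_leaf)   # True False
print(z.grad)                 # tensor([0.5000, 0.5000])
```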


import torch

# create a tensor with autograd enabled
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Forward pass: apply operations to the tensor
y = x * x  # e.g. y = x^2

# compute the loss
loss = y.sum()

# Backward pass: compute the gradients
loss.backward()

# inspect the gradient
print(x.grad)  # gradient of loss with respect to x, i.e. 2x
>>> tensor([2., 4., 6.])
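One behavior that often surprises newcomers is that backward() adds to an existing .grad instead of replacing it. The sketch below runs the same computation twice and then resets the gradient by hand; in a real training loop an optimizer's zero_grad() usually handles this.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

(x * x).sum().backward()
first = x.grad.clone()      # 2x

(x * x).sum().backward()    # the new gradient is added onto x.grad
second = x.grad.clone()     # 2x + 2x

x.grad.zero_()              # reset before the next backward pass

print(first)
print(second)
print(x.grad)
```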

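Gradients are normally tracked while a model is being trained.

When a trained model is only being used, its parameters are not updated, so gradients do not need to be tracked.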
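Turning tracking off is typically done with the torch.no_grad() context manager. The sketch below uses a single tensor `w` as a stand-in for model parameters, which is an assumption made only to keep the example small.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)   # stand-in for model parameters

# Normal mode: the result is connected to the autograd graph
tracked = (w * 3).sum()
print(tracked.requires_grad)     # True

# Inside no_grad, operations are not recorded
with torch.no_grad():
    untracked = (w * 3).sum()
print(untracked.requires_grad)   # False

# detach() gives a tensor with the same values that is cut off from the graph
detached = w.detach()
print(detached.requires_grad)    # False
```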
