init fusion lcd orin config
This commit is contained in:
4
.gitignore
vendored
Normal file
4
.gitignore
vendored
Normal file
@@ -0,0 +1,4 @@
|
||||
result/
|
||||
sequences/
|
||||
KITTI/
|
||||
KITTI360/
|
||||
29
ALIKE/LICENSE
Normal file
29
ALIKE/LICENSE
Normal file
@@ -0,0 +1,29 @@
|
||||
BSD 3-Clause License
|
||||
|
||||
Copyright (c) 2022, Zhao Xiaoming
|
||||
All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright notice, this
|
||||
list of conditions and the following disclaimer.
|
||||
|
||||
2. Redistributions in binary form must reproduce the above copyright notice,
|
||||
this list of conditions and the following disclaimer in the documentation
|
||||
and/or other materials provided with the distribution.
|
||||
|
||||
3. Neither the name of the copyright holder nor the names of its
|
||||
contributors may be used to endorse or promote products derived from
|
||||
this software without specific prior written permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
|
||||
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
|
||||
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
|
||||
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
|
||||
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
|
||||
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
|
||||
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
|
||||
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
117
ALIKE/README.md
Normal file
117
ALIKE/README.md
Normal file
@@ -0,0 +1,117 @@
|
||||
# News: the cpp version is released [ALIKE-cpp](https://github.com/Shiaoming/ALIKE-cpp).
|
||||
|
||||
# ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor Extraction
|
||||
|
||||
ALIKE applies a differentiable keypoint detection module to detect accurate sub-pixel keypoints. The network can run at 95 frames per second for 640 x 480 images on NVIDIA Titan X (Pascal) GPU and achieve equivalent performance with the state-of-the-arts. ALIKE benefits real-time applications in resource-limited platforms/devices. Technical details are described in [this paper](https://arxiv.org/pdf/2112.02906.pdf).
|
||||
|
||||
> ```
|
||||
> Xiaoming Zhao, Xingming Wu, Jinyu Miao, Weihai Chen, Peter C. Y. Chen, Zhengguo Li, "ALIKE: Accurate and Lightweight Keypoint
|
||||
> Detection and Descriptor Extraction," IEEE Transactions on Multimedia, 2022.
|
||||
> ```
|
||||
|
||||

|
||||
|
||||
|
||||
If you use ALIKE in an academic work, please cite:
|
||||
|
||||
```
|
||||
@article{Zhao2022ALIKE,
|
||||
title = {ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor Extraction},
|
||||
url = {http://arxiv.org/abs/2112.02906},
|
||||
doi = {10.1109/TMM.2022.3155927},
|
||||
journal = {IEEE Transactions on Multimedia},
|
||||
author = {Zhao, Xiaoming and Wu, Xingming and Miao, Jinyu and Chen, Weihai and Chen, Peter C. Y. and Li, Zhengguo},
|
||||
month = march,
|
||||
year = {2022},
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
|
||||
## 1. Prerequisites
|
||||
|
||||
The required packages are listed in the `requirements.txt` :
|
||||
|
||||
```shell
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
|
||||
|
||||
## 2. Models
|
||||
|
||||
The off-the-shelf weights of four variant ALIKE models are provided in `models/` .
|
||||
|
||||
|
||||
|
||||
## 3. Run demo
|
||||
|
||||
```shell
|
||||
$ python demo.py -h
|
||||
usage: demo.py [-h] [--model {alike-t,alike-s,alike-n,alike-l}]
|
||||
[--device DEVICE] [--top_k TOP_K] [--scores_th SCORES_TH]
|
||||
[--n_limit N_LIMIT] [--no_display] [--no_sub_pixel]
|
||||
input
|
||||
|
||||
ALike Demo.
|
||||
|
||||
positional arguments:
|
||||
input Image directory or movie file or "camera0" (for
|
||||
webcam0).
|
||||
|
||||
optional arguments:
|
||||
-h, --help show this help message and exit
|
||||
--model {alike-t,alike-s,alike-n,alike-l}
|
||||
The model configuration
|
||||
--device DEVICE Running device (default: cuda).
|
||||
--top_k TOP_K Detect top K keypoints. -1 for threshold based mode,
|
||||
>0 for top K mode. (default: -1)
|
||||
--scores_th SCORES_TH
|
||||
Detector score threshold (default: 0.2).
|
||||
--n_limit N_LIMIT Maximum number of keypoints to be detected (default:
|
||||
5000).
|
||||
--no_display Do not display images to screen. Useful if running
|
||||
remotely (default: False).
|
||||
--no_sub_pixel Do not detect sub-pixel keypoints (default: False).
|
||||
```
|
||||
|
||||
|
||||
|
||||
## 4. Examples
|
||||
|
||||
### KITTI example
|
||||
```shell
|
||||
python demo.py assets/kitti
|
||||
```
|
||||

|
||||
|
||||
### TUM example
|
||||
```shell
|
||||
python demo.py assets/tum
|
||||
```
|
||||

|
||||
|
||||
## 5. Efficiency and performance
|
||||
|
||||
| Models | Parameters | GFLOPs(640x480) | MHA@3 on Hpatches | mAA(10°) on [IMW2020-test](https://www.cs.ubc.ca/research/image-matching-challenge/2021/leaderboard) (Stereo) |
|
||||
|:---:|:---:|:---:|:-----------------:|:-------------------------------------------------------------------------------------------------------------:|
|
||||
| D2-Net(MS) | 7653KB | 889.40 | 38.33% | 12.27% |
|
||||
| LF-Net(MS) | 2642KB | 24.37 | 57.78% | 23.44% |
|
||||
| SuperPoint | 1301KB | 26.11 | 70.19% | 28.97% |
|
||||
| R2D2(MS) | 484KB | 464.55 | 71.48% | 39.02% |
|
||||
| ASLFeat(MS) | 823KB | 77.58 | 73.52% | 33.65% |
|
||||
| DISK | 1092KB | 98.97 | 70.56% | 51.22% |
|
||||
| ALike-N | 318KB | 7.909 | 75.74% | 47.18% |
|
||||
| ALike-L | 653KB | 19.685 | 76.85% | 49.58% |
|
||||
|
||||
### Evaluation on Hpatches
|
||||
|
||||
- Download [hpatches-sequences-release](https://hpatches.github.io/) and put it into `hseq/hpatches-sequences-release`.
|
||||
- Remove the unreliable sequences as D2-Net.
|
||||
- Run the following command to evaluate the performance:
|
||||
```shell
|
||||
python hseq/eval.py
|
||||
```
|
||||
|
||||
|
||||
For more details, please refer to the [paper](https://arxiv.org/abs/2112.02906).
|
||||
BIN
ALIKE/__pycache__/alike.cpython-38.pyc
Normal file
BIN
ALIKE/__pycache__/alike.cpython-38.pyc
Normal file
Binary file not shown.
BIN
ALIKE/__pycache__/alnet.cpython-38.pyc
Normal file
BIN
ALIKE/__pycache__/alnet.cpython-38.pyc
Normal file
Binary file not shown.
BIN
ALIKE/__pycache__/soft_detect.cpython-38.pyc
Normal file
BIN
ALIKE/__pycache__/soft_detect.cpython-38.pyc
Normal file
Binary file not shown.
143
ALIKE/alike.py
Normal file
143
ALIKE/alike.py
Normal file
@@ -0,0 +1,143 @@
|
||||
import logging
|
||||
import os
|
||||
import cv2
|
||||
import torch
|
||||
from copy import deepcopy
|
||||
import torch.nn.functional as F
|
||||
from torchvision.transforms import ToTensor
|
||||
import math
|
||||
|
||||
from ALIKE.alnet import ALNet
|
||||
from ALIKE.soft_detect import DKD
|
||||
import time
|
||||
|
||||
configs = {
|
||||
'alike-t': {'c1': 8, 'c2': 16, 'c3': 32, 'c4': 64, 'dim': 64, 'single_head': True, 'radius': 2,
|
||||
'model_path': os.path.join(os.path.split(__file__)[0], 'models', 'alike-t.pth')},
|
||||
'alike-s': {'c1': 8, 'c2': 16, 'c3': 48, 'c4': 96, 'dim': 96, 'single_head': True, 'radius': 2,
|
||||
'model_path': os.path.join(os.path.split(__file__)[0], 'models', 'alike-s.pth')},
|
||||
'alike-n': {'c1': 16, 'c2': 32, 'c3': 64, 'c4': 128, 'dim': 128, 'single_head': True, 'radius': 2,
|
||||
'model_path': os.path.join(os.path.split(__file__)[0], 'models', 'alike-n.pth')},
|
||||
'alike-l': {'c1': 32, 'c2': 64, 'c3': 128, 'c4': 128, 'dim': 128, 'single_head': False, 'radius': 2,
|
||||
'model_path': os.path.join(os.path.split(__file__)[0], 'models', 'alike-l.pth')},
|
||||
}
|
||||
|
||||
|
||||
class ALike(ALNet):
|
||||
def __init__(self,
|
||||
# ================================== feature encoder
|
||||
c1: int = 32, c2: int = 64, c3: int = 128, c4: int = 128, dim: int = 128,
|
||||
single_head: bool = False,
|
||||
# ================================== detect parameters
|
||||
radius: int = 2,
|
||||
top_k: int = 500, scores_th: float = 0.5,
|
||||
n_limit: int = 5000,
|
||||
device: str = 'cpu',
|
||||
model_path: str = ''
|
||||
):
|
||||
super().__init__(c1, c2, c3, c4, dim, single_head)
|
||||
self.radius = radius
|
||||
self.top_k = top_k
|
||||
self.n_limit = n_limit
|
||||
self.scores_th = scores_th
|
||||
self.dkd = DKD(radius=self.radius, top_k=self.top_k,
|
||||
scores_th=self.scores_th, n_limit=self.n_limit)
|
||||
self.device = device
|
||||
|
||||
if model_path != '':
|
||||
state_dict = torch.load(model_path, self.device)
|
||||
self.load_state_dict(state_dict)
|
||||
self.to(self.device)
|
||||
self.eval()
|
||||
logging.info(f'Loaded model parameters from {model_path}')
|
||||
logging.info(
|
||||
f"Number of model parameters: {sum(p.numel() for p in self.parameters() if p.requires_grad) / 1e3}KB")
|
||||
|
||||
def extract_dense_map(self, image, ret_dict=False):
|
||||
# ====================================================
|
||||
# check image size, should be integer multiples of 2^5
|
||||
# if it is not a integer multiples of 2^5, padding zeros
|
||||
device = image.device
|
||||
b, c, h, w = image.shape
|
||||
h_ = math.ceil(h / 32) * 32 if h % 32 != 0 else h
|
||||
w_ = math.ceil(w / 32) * 32 if w % 32 != 0 else w
|
||||
if h_ != h:
|
||||
h_padding = torch.zeros(b, c, h_ - h, w, device=device)
|
||||
image = torch.cat([image, h_padding], dim=2)
|
||||
if w_ != w:
|
||||
w_padding = torch.zeros(b, c, h_, w_ - w, device=device)
|
||||
image = torch.cat([image, w_padding], dim=3)
|
||||
# ====================================================
|
||||
|
||||
scores_map, descriptor_map = super().forward(image)
|
||||
|
||||
# ====================================================
|
||||
if h_ != h or w_ != w:
|
||||
descriptor_map = descriptor_map[:, :, :h, :w]
|
||||
scores_map = scores_map[:, :, :h, :w] # Bx1xHxW
|
||||
# ====================================================
|
||||
|
||||
# BxCxHxW
|
||||
descriptor_map = torch.nn.functional.normalize(descriptor_map, p=2, dim=1)
|
||||
|
||||
if ret_dict:
|
||||
return {'descriptor_map': descriptor_map, 'scores_map': scores_map, }
|
||||
else:
|
||||
return descriptor_map, scores_map
|
||||
|
||||
def forward(self, img, image_size_max=99999, sort=False, sub_pixel=False):
|
||||
"""
|
||||
:param img: np.array HxWx3, RGB
|
||||
:param image_size_max: maximum image size, otherwise, the image will be resized
|
||||
:param sort: sort keypoints by scores
|
||||
:param sub_pixel: whether to use sub-pixel accuracy
|
||||
:return: a dictionary with 'keypoints', 'descriptors', 'scores', and 'time'
|
||||
"""
|
||||
H, W, three = img.shape
|
||||
assert three == 3, "input image shape should be [HxWx3]"
|
||||
|
||||
# ==================== image size constraint
|
||||
image = deepcopy(img)
|
||||
max_hw = max(H, W)
|
||||
if max_hw > image_size_max:
|
||||
ratio = float(image_size_max / max_hw)
|
||||
image = cv2.resize(image, dsize=None, fx=ratio, fy=ratio)
|
||||
|
||||
# ==================== convert image to tensor
|
||||
image = torch.from_numpy(image).to(self.device).to(torch.float32).permute(2, 0, 1)[None] / 255.0
|
||||
|
||||
# ==================== extract keypoints
|
||||
start = time.time()
|
||||
|
||||
with torch.no_grad():
|
||||
descriptor_map, scores_map = self.extract_dense_map(image)
|
||||
keypoints, descriptors, scores, _ = self.dkd(scores_map, descriptor_map,
|
||||
sub_pixel=sub_pixel)
|
||||
keypoints, descriptors, scores = keypoints[0], descriptors[0], scores[0]
|
||||
keypoints = (keypoints + 1) / 2 * keypoints.new_tensor([[W - 1, H - 1]])
|
||||
|
||||
if sort:
|
||||
indices = torch.argsort(scores, descending=True)
|
||||
keypoints = keypoints[indices]
|
||||
descriptors = descriptors[indices]
|
||||
scores = scores[indices]
|
||||
|
||||
end = time.time()
|
||||
|
||||
return {'keypoints': keypoints.cpu().numpy(),
|
||||
'descriptors': descriptors.cpu().numpy(),
|
||||
'scores': scores.cpu().numpy(),
|
||||
'scores_map': scores_map.cpu().numpy(),
|
||||
'time': end - start, }
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
import numpy as np
|
||||
from thop import profile
|
||||
|
||||
net = ALike(c1=32, c2=64, c3=128, c4=128, dim=128, single_head=False)
|
||||
|
||||
image = np.random.random((640, 480, 3)).astype(np.float32)
|
||||
flops, params = profile(net, inputs=(image, 9999, False), verbose=False)
|
||||
print('{:<30} {:<8} GFLops'.format('Computational complexity: ', flops / 1e9))
|
||||
print('{:<30} {:<8} KB'.format('Number of parameters: ', params / 1e3))
|
||||
163
ALIKE/alnet.py
Normal file
163
ALIKE/alnet.py
Normal file
@@ -0,0 +1,163 @@
|
||||
import torch
|
||||
from torch import nn
|
||||
from torchvision.models import resnet
|
||||
from typing import Optional, Callable
|
||||
|
||||
|
||||
class ConvBlock(nn.Module):
|
||||
def __init__(self, in_channels, out_channels,
|
||||
gate: Optional[Callable[..., nn.Module]] = None,
|
||||
norm_layer: Optional[Callable[..., nn.Module]] = None):
|
||||
super().__init__()
|
||||
if gate is None:
|
||||
self.gate = nn.ReLU(inplace=True)
|
||||
else:
|
||||
self.gate = gate
|
||||
if norm_layer is None:
|
||||
norm_layer = nn.BatchNorm2d
|
||||
self.conv1 = resnet.conv3x3(in_channels, out_channels)
|
||||
self.bn1 = norm_layer(out_channels)
|
||||
self.conv2 = resnet.conv3x3(out_channels, out_channels)
|
||||
self.bn2 = norm_layer(out_channels)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.gate(self.bn1(self.conv1(x))) # B x in_channels x H x W
|
||||
x = self.gate(self.bn2(self.conv2(x))) # B x out_channels x H x W
|
||||
return x
|
||||
|
||||
|
||||
# copied from torchvision\models\resnet.py#27->BasicBlock
|
||||
class ResBlock(nn.Module):
|
||||
expansion: int = 1
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
inplanes: int,
|
||||
planes: int,
|
||||
stride: int = 1,
|
||||
downsample: Optional[nn.Module] = None,
|
||||
groups: int = 1,
|
||||
base_width: int = 64,
|
||||
dilation: int = 1,
|
||||
gate: Optional[Callable[..., nn.Module]] = None,
|
||||
norm_layer: Optional[Callable[..., nn.Module]] = None
|
||||
) -> None:
|
||||
super(ResBlock, self).__init__()
|
||||
if gate is None:
|
||||
self.gate = nn.ReLU(inplace=True)
|
||||
else:
|
||||
self.gate = gate
|
||||
if norm_layer is None:
|
||||
norm_layer = nn.BatchNorm2d
|
||||
if groups != 1 or base_width != 64:
|
||||
raise ValueError('ResBlock only supports groups=1 and base_width=64')
|
||||
if dilation > 1:
|
||||
raise NotImplementedError("Dilation > 1 not supported in ResBlock")
|
||||
# Both self.conv1 and self.downsample layers downsample the input when stride != 1
|
||||
self.conv1 = resnet.conv3x3(inplanes, planes, stride)
|
||||
self.bn1 = norm_layer(planes)
|
||||
self.conv2 = resnet.conv3x3(planes, planes)
|
||||
self.bn2 = norm_layer(planes)
|
||||
self.downsample = downsample
|
||||
self.stride = stride
|
||||
|
||||
def forward(self, x: torch.Tensor) -> torch.Tensor:
|
||||
identity = x
|
||||
|
||||
out = self.conv1(x)
|
||||
out = self.bn1(out)
|
||||
out = self.gate(out)
|
||||
|
||||
out = self.conv2(out)
|
||||
out = self.bn2(out)
|
||||
|
||||
if self.downsample is not None:
|
||||
identity = self.downsample(x)
|
||||
|
||||
out += identity
|
||||
out = self.gate(out)
|
||||
|
||||
return out
|
||||
|
||||
|
||||
class ALNet(nn.Module):
|
||||
def __init__(self, c1: int = 32, c2: int = 64, c3: int = 128, c4: int = 128, dim: int = 128,
|
||||
single_head: bool = True,
|
||||
):
|
||||
super().__init__()
|
||||
self.feature_size = dim
|
||||
self.gate = nn.ReLU(inplace=True)
|
||||
|
||||
self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
|
||||
self.pool4 = nn.MaxPool2d(kernel_size=4, stride=4)
|
||||
|
||||
self.block1 = ConvBlock(3, c1, self.gate, nn.BatchNorm2d)
|
||||
|
||||
self.block2 = ResBlock(inplanes=c1, planes=c2, stride=1,
|
||||
downsample=nn.Conv2d(c1, c2, 1),
|
||||
gate=self.gate,
|
||||
norm_layer=nn.BatchNorm2d)
|
||||
self.block3 = ResBlock(inplanes=c2, planes=c3, stride=1,
|
||||
downsample=nn.Conv2d(c2, c3, 1),
|
||||
gate=self.gate,
|
||||
norm_layer=nn.BatchNorm2d)
|
||||
self.block4 = ResBlock(inplanes=c3, planes=c4, stride=1,
|
||||
downsample=nn.Conv2d(c3, c4, 1),
|
||||
gate=self.gate,
|
||||
norm_layer=nn.BatchNorm2d)
|
||||
# ================================== feature aggregation
|
||||
self.conv1 = resnet.conv1x1(c1, dim // 4)
|
||||
self.conv2 = resnet.conv1x1(c2, dim // 4)
|
||||
self.conv3 = resnet.conv1x1(c3, dim // 4)
|
||||
self.conv4 = resnet.conv1x1(c4, dim // 4)
|
||||
self.upsample2 = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
|
||||
self.upsample4 = nn.Upsample(scale_factor=4, mode='bilinear', align_corners=True)
|
||||
self.upsample8 = nn.Upsample(scale_factor=8, mode='bilinear', align_corners=True)
|
||||
self.upsample32 = nn.Upsample(scale_factor=32, mode='bilinear', align_corners=True)
|
||||
|
||||
# ================================== detector and descriptor head
|
||||
self.single_head = single_head
|
||||
if not self.single_head:
|
||||
self.convhead1 = resnet.conv1x1(dim, dim)
|
||||
self.convhead2 = resnet.conv1x1(dim, dim + 1)
|
||||
|
||||
def forward(self, image):
|
||||
# ================================== feature encoder
|
||||
x1 = self.block1(image) # B x c1 x H x W
|
||||
x2 = self.pool2(x1)
|
||||
x2 = self.block2(x2) # B x c2 x H/2 x W/2
|
||||
x3 = self.pool4(x2)
|
||||
x3 = self.block3(x3) # B x c3 x H/8 x W/8
|
||||
x4 = self.pool4(x3)
|
||||
x4 = self.block4(x4) # B x dim x H/32 x W/32
|
||||
|
||||
# ================================== feature aggregation
|
||||
x1 = self.gate(self.conv1(x1)) # B x dim//4 x H x W
|
||||
x2 = self.gate(self.conv2(x2)) # B x dim//4 x H//2 x W//2
|
||||
x3 = self.gate(self.conv3(x3)) # B x dim//4 x H//8 x W//8
|
||||
x4 = self.gate(self.conv4(x4)) # B x dim//4 x H//32 x W//32
|
||||
x2_up = self.upsample2(x2) # B x dim//4 x H x W
|
||||
x3_up = self.upsample8(x3) # B x dim//4 x H x W
|
||||
x4_up = self.upsample32(x4) # B x dim//4 x H x W
|
||||
x1234 = torch.cat([x1, x2_up, x3_up, x4_up], dim=1)
|
||||
|
||||
# ================================== detector and descriptor head
|
||||
if not self.single_head:
|
||||
x1234 = self.gate(self.convhead1(x1234))
|
||||
x = self.convhead2(x1234) # B x dim+1 x H x W
|
||||
|
||||
descriptor_map = x[:, :-1, :, :]
|
||||
scores_map = torch.sigmoid(x[:, -1, :, :]).unsqueeze(1)
|
||||
|
||||
return scores_map, descriptor_map
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
from thop import profile
|
||||
|
||||
net = ALNet(c1=16, c2=32, c3=64, c4=128, dim=128, single_head=True)
|
||||
|
||||
image = torch.randn(1, 3, 640, 480)
|
||||
flops, params = profile(net, inputs=(image,), verbose=False)
|
||||
print('{:<30} {:<8} GFLops'.format('Computational complexity: ', flops / 1e9))
|
||||
print('{:<30} {:<8} KB'.format('Number of parameters: ', params / 1e3))
|
||||
167
ALIKE/demo.py
Normal file
167
ALIKE/demo.py
Normal file
@@ -0,0 +1,167 @@
|
||||
import copy
|
||||
import os
|
||||
import cv2
|
||||
import glob
|
||||
import logging
|
||||
import argparse
|
||||
import numpy as np
|
||||
from tqdm import tqdm
|
||||
from alike import ALike, configs
|
||||
|
||||
|
||||
class ImageLoader(object):
|
||||
def __init__(self, filepath: str):
|
||||
self.N = 3000
|
||||
if filepath.startswith('camera'):
|
||||
camera = int(filepath[6:])
|
||||
self.cap = cv2.VideoCapture(camera)
|
||||
if not self.cap.isOpened():
|
||||
raise IOError(f"Can't open camera {camera}!")
|
||||
logging.info(f'Opened camera {camera}')
|
||||
self.mode = 'camera'
|
||||
elif os.path.exists(filepath):
|
||||
if os.path.isfile(filepath):
|
||||
self.cap = cv2.VideoCapture(filepath)
|
||||
if not self.cap.isOpened():
|
||||
raise IOError(f"Can't open video {filepath}!")
|
||||
rate = self.cap.get(cv2.CAP_PROP_FPS)
|
||||
self.N = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT)) - 1
|
||||
duration = self.N / rate
|
||||
logging.info(f'Opened video {filepath}')
|
||||
logging.info(f'Frames: {self.N}, FPS: {rate}, Duration: {duration}s')
|
||||
self.mode = 'video'
|
||||
else:
|
||||
self.images = glob.glob(os.path.join(filepath, '*.png')) + \
|
||||
glob.glob(os.path.join(filepath, '*.jpg')) + \
|
||||
glob.glob(os.path.join(filepath, '*.ppm'))
|
||||
self.images.sort()
|
||||
self.N = len(self.images)
|
||||
logging.info(f'Loading {self.N} images')
|
||||
self.mode = 'images'
|
||||
else:
|
||||
raise IOError('Error filepath (camerax/path of images/path of videos): ', filepath)
|
||||
|
||||
def __getitem__(self, item):
|
||||
if self.mode == 'camera' or self.mode == 'video':
|
||||
if item > self.N:
|
||||
return None
|
||||
ret, img = self.cap.read()
|
||||
if not ret:
|
||||
raise "Can't read image from camera"
|
||||
if self.mode == 'video':
|
||||
self.cap.set(cv2.CAP_PROP_POS_FRAMES, item)
|
||||
elif self.mode == 'images':
|
||||
filename = self.images[item]
|
||||
img = cv2.imread(filename)
|
||||
if img is None:
|
||||
raise Exception('Error reading image %s' % filename)
|
||||
return img
|
||||
|
||||
def __len__(self):
|
||||
return self.N
|
||||
|
||||
|
||||
class SimpleTracker(object):
|
||||
def __init__(self):
|
||||
self.pts_prev = None
|
||||
self.desc_prev = None
|
||||
|
||||
def update(self, img, pts, desc):
|
||||
N_matches = 0
|
||||
if self.pts_prev is None:
|
||||
self.pts_prev = pts
|
||||
self.desc_prev = desc
|
||||
|
||||
out = copy.deepcopy(img)
|
||||
for pt1 in pts:
|
||||
p1 = (int(round(pt1[0])), int(round(pt1[1])))
|
||||
cv2.circle(out, p1, 1, (0, 0, 255), -1, lineType=16)
|
||||
else:
|
||||
matches = self.mnn_mather(self.desc_prev, desc)
|
||||
mpts1, mpts2 = self.pts_prev[matches[:, 0]], pts[matches[:, 1]]
|
||||
N_matches = len(matches)
|
||||
|
||||
out = copy.deepcopy(img)
|
||||
for pt1, pt2 in zip(mpts1, mpts2):
|
||||
p1 = (int(round(pt1[0])), int(round(pt1[1])))
|
||||
p2 = (int(round(pt2[0])), int(round(pt2[1])))
|
||||
cv2.line(out, p1, p2, (0, 255, 0), lineType=16)
|
||||
cv2.circle(out, p2, 1, (0, 0, 255), -1, lineType=16)
|
||||
|
||||
self.pts_prev = pts
|
||||
self.desc_prev = desc
|
||||
|
||||
return out, N_matches
|
||||
|
||||
def mnn_mather(self, desc1, desc2):
|
||||
sim = desc1 @ desc2.transpose()
|
||||
sim[sim < 0.9] = 0
|
||||
nn12 = np.argmax(sim, axis=1)
|
||||
nn21 = np.argmax(sim, axis=0)
|
||||
ids1 = np.arange(0, sim.shape[0])
|
||||
mask = (ids1 == nn21[nn12])
|
||||
matches = np.stack([ids1[mask], nn12[mask]])
|
||||
return matches.transpose()
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
parser = argparse.ArgumentParser(description='ALike Demo.')
|
||||
parser.add_argument('--input', type=str, default=r'E:\caodanyang\dataset\KITTI\odometry\data_odometry_color\dataset\sequences\00\image_2',
|
||||
help='Image directory or movie file or "camera0" (for webcam0).')
|
||||
parser.add_argument('--model', choices=['alike-t', 'alike-s', 'alike-n', 'alike-l'], default="alike-t",
|
||||
help="The model configuration")
|
||||
parser.add_argument('--device', type=str, default='cuda', help="Running device (default: cuda).")
|
||||
parser.add_argument('--top_k', type=int, default=-1,
|
||||
help='Detect top K keypoints. -1 for threshold based mode, >0 for top K mode. (default: -1)')
|
||||
parser.add_argument('--scores_th', type=float, default=0.2,
|
||||
help='Detector score threshold (default: 0.2).')
|
||||
parser.add_argument('--n_limit', type=int, default=5000,
|
||||
help='Maximum number of keypoints to be detected (default: 5000).')
|
||||
parser.add_argument('--no_display', action='store_true',
|
||||
help='Do not display images to screen. Useful if running remotely (default: False).')
|
||||
parser.add_argument('--no_sub_pixel', action='store_true',
|
||||
help='Do not detect sub-pixel keypoints (default: False).')
|
||||
args = parser.parse_args()
|
||||
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
|
||||
image_loader = ImageLoader(args.input)
|
||||
model = ALike(**configs[args.model],
|
||||
device=args.device,
|
||||
top_k=args.top_k,
|
||||
scores_th=args.scores_th,
|
||||
n_limit=args.n_limit)
|
||||
tracker = SimpleTracker()
|
||||
|
||||
if not args.no_display:
|
||||
logging.info("Press 'q' to stop!")
|
||||
cv2.namedWindow(args.model)
|
||||
|
||||
runtime = []
|
||||
progress_bar = tqdm(image_loader)
|
||||
for img in progress_bar:
|
||||
if img is None:
|
||||
break
|
||||
|
||||
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
|
||||
pred = model(img_rgb, sub_pixel=not args.no_sub_pixel)
|
||||
kpts = pred['keypoints']
|
||||
desc = pred['descriptors']
|
||||
runtime.append(pred['time'])
|
||||
|
||||
out, N_matches = tracker.update(img, kpts, desc)
|
||||
|
||||
ave_fps = (1. / np.stack(runtime)).mean()
|
||||
status = f"Fps:{ave_fps:.1f}, Keypoints/Matches: {len(kpts)}/{N_matches}"
|
||||
progress_bar.set_description(status)
|
||||
|
||||
if not args.no_display:
|
||||
cv2.setWindowTitle(args.model, args.model + ': ' + status)
|
||||
cv2.imshow(args.model, out)
|
||||
if cv2.waitKey(1) == ord('q'):
|
||||
break
|
||||
|
||||
logging.info('Finished!')
|
||||
if not args.no_display:
|
||||
logging.info('Press any key to exit!')
|
||||
cv2.waitKey()
|
||||
BIN
ALIKE/models/alike-l.pth
Normal file
BIN
ALIKE/models/alike-l.pth
Normal file
Binary file not shown.
BIN
ALIKE/models/alike-n.pth
Normal file
BIN
ALIKE/models/alike-n.pth
Normal file
Binary file not shown.
BIN
ALIKE/models/alike-s.pth
Normal file
BIN
ALIKE/models/alike-s.pth
Normal file
Binary file not shown.
BIN
ALIKE/models/alike-t.pth
Normal file
BIN
ALIKE/models/alike-t.pth
Normal file
Binary file not shown.
6
ALIKE/requirements.txt
Normal file
6
ALIKE/requirements.txt
Normal file
@@ -0,0 +1,6 @@
|
||||
opencv-python~=4.5.1.48
|
||||
numpy~=1.19.5
|
||||
tqdm~=4.60.0
|
||||
torch~=1.8.0
|
||||
torchvision~=0.9.0
|
||||
thop~=0.0.31-2005241907
|
||||
194
ALIKE/soft_detect.py
Normal file
194
ALIKE/soft_detect.py
Normal file
@@ -0,0 +1,194 @@
|
||||
import torch
|
||||
from torch import nn
|
||||
import torch.nn.functional as F
|
||||
|
||||
|
||||
# coordinates system
|
||||
# ------------------------------> [ x: range=-1.0~1.0; w: range=0~W ]
|
||||
# | -----------------------------
|
||||
# | | |
|
||||
# | | |
|
||||
# | | |
|
||||
# | | image |
|
||||
# | | |
|
||||
# | | |
|
||||
# | | |
|
||||
# | |---------------------------|
|
||||
# v
|
||||
# [ y: range=-1.0~1.0; h: range=0~H ]
|
||||
|
||||
def simple_nms(scores, nms_radius: int):
|
||||
""" Fast Non-maximum suppression to remove nearby points """
|
||||
assert (nms_radius >= 0)
|
||||
|
||||
def max_pool(x):
|
||||
return torch.nn.functional.max_pool2d(
|
||||
x, kernel_size=nms_radius * 2 + 1, stride=1, padding=nms_radius)
|
||||
|
||||
zeros = torch.zeros_like(scores)
|
||||
max_mask = scores == max_pool(scores)
|
||||
|
||||
for _ in range(2):
|
||||
supp_mask = max_pool(max_mask.float()) > 0
|
||||
supp_scores = torch.where(supp_mask, zeros, scores)
|
||||
new_max_mask = supp_scores == max_pool(supp_scores)
|
||||
max_mask = max_mask | (new_max_mask & (~supp_mask))
|
||||
return torch.where(max_mask, scores, zeros)
|
||||
|
||||
|
||||
def sample_descriptor(descriptor_map, kpts, bilinear_interp=False):
|
||||
"""
|
||||
:param descriptor_map: BxCxHxW
|
||||
:param kpts: list, len=B, each is Nx2 (keypoints) [h,w]
|
||||
:param bilinear_interp: bool, whether to use bilinear interpolation
|
||||
:return: descriptors: list, len=B, each is NxD
|
||||
"""
|
||||
batch_size, channel, height, width = descriptor_map.shape
|
||||
|
||||
descriptors = []
|
||||
for index in range(batch_size):
|
||||
kptsi = kpts[index] # Nx2,(x,y)
|
||||
|
||||
if bilinear_interp:
|
||||
descriptors_ = torch.nn.functional.grid_sample(descriptor_map[index].unsqueeze(0), kptsi.view(1, 1, -1, 2),
|
||||
mode='bilinear', align_corners=True)[0, :, 0, :] # CxN
|
||||
else:
|
||||
kptsi = (kptsi + 1) / 2 * kptsi.new_tensor([[width - 1, height - 1]])
|
||||
kptsi = kptsi.long()
|
||||
descriptors_ = descriptor_map[index, :, kptsi[:, 1], kptsi[:, 0]] # CxN
|
||||
|
||||
descriptors_ = torch.nn.functional.normalize(descriptors_, p=2, dim=0)
|
||||
descriptors.append(descriptors_.t())
|
||||
|
||||
return descriptors
|
||||
|
||||
|
||||
class DKD(nn.Module):
|
||||
def __init__(self, radius=2, top_k=0, scores_th=0.2, n_limit=20000):
|
||||
"""
|
||||
Args:
|
||||
radius: soft detection radius, kernel size is (2 * radius + 1)
|
||||
top_k: top_k > 0: return top k keypoints
|
||||
scores_th: top_k <= 0 threshold mode: scores_th > 0: return keypoints with scores>scores_th
|
||||
else: return keypoints with scores > scores.mean()
|
||||
n_limit: max number of keypoint in threshold mode
|
||||
"""
|
||||
super().__init__()
|
||||
self.radius = radius
|
||||
self.top_k = top_k
|
||||
self.scores_th = scores_th
|
||||
self.n_limit = n_limit
|
||||
self.kernel_size = 2 * self.radius + 1
|
||||
self.temperature = 0.1 # tuned temperature
|
||||
self.unfold = nn.Unfold(kernel_size=self.kernel_size, padding=self.radius)
|
||||
|
||||
# local xy grid
|
||||
x = torch.linspace(-self.radius, self.radius, self.kernel_size)
|
||||
# (kernel_size*kernel_size) x 2 : (w,h)
|
||||
self.hw_grid = torch.stack(torch.meshgrid([x, x])).view(2, -1).t()[:, [1, 0]]
|
||||
|
||||
def detect_keypoints(self, scores_map, sub_pixel=True):
|
||||
b, c, h, w = scores_map.shape
|
||||
scores_nograd = scores_map.detach()
|
||||
# nms_scores = simple_nms(scores_nograd, self.radius)
|
||||
nms_scores = simple_nms(scores_nograd, 2)
|
||||
|
||||
# remove border
|
||||
nms_scores[:, :, :self.radius + 1, :] = 0
|
||||
nms_scores[:, :, :, :self.radius + 1] = 0
|
||||
nms_scores[:, :, h - self.radius:, :] = 0
|
||||
nms_scores[:, :, :, w - self.radius:] = 0
|
||||
|
||||
# detect keypoints without grad
|
||||
if self.top_k > 0:
|
||||
topk = torch.topk(nms_scores.view(b, -1), self.top_k)
|
||||
indices_keypoints = topk.indices # B x top_k
|
||||
else:
|
||||
if self.scores_th > 0:
|
||||
masks = nms_scores > self.scores_th
|
||||
if masks.sum() == 0:
|
||||
th = scores_nograd.reshape(b, -1).mean(dim=1) # th = self.scores_th
|
||||
masks = nms_scores > th.reshape(b, 1, 1, 1)
|
||||
else:
|
||||
th = scores_nograd.reshape(b, -1).mean(dim=1) # th = self.scores_th
|
||||
masks = nms_scores > th.reshape(b, 1, 1, 1)
|
||||
masks = masks.reshape(b, -1)
|
||||
|
||||
indices_keypoints = [] # list, B x (any size)
|
||||
scores_view = scores_nograd.reshape(b, -1)
|
||||
for mask, scores in zip(masks, scores_view):
|
||||
indices = mask.nonzero(as_tuple=False)[:, 0]
|
||||
if len(indices) > self.n_limit:
|
||||
kpts_sc = scores[indices]
|
||||
sort_idx = kpts_sc.sort(descending=True)[1]
|
||||
sel_idx = sort_idx[:self.n_limit]
|
||||
indices = indices[sel_idx]
|
||||
indices_keypoints.append(indices)
|
||||
|
||||
keypoints = []
|
||||
scoredispersitys = []
|
||||
kptscores = []
|
||||
if sub_pixel:
|
||||
# detect soft keypoints with grad backpropagation
|
||||
patches = self.unfold(scores_map) # B x (kernel**2) x (H*W)
|
||||
self.hw_grid = self.hw_grid.to(patches) # to device
|
||||
for b_idx in range(b):
|
||||
patch = patches[b_idx].t() # (H*W) x (kernel**2)
|
||||
indices_kpt = indices_keypoints[b_idx] # one dimension vector, say its size is M
|
||||
patch_scores = patch[indices_kpt] # M x (kernel**2)
|
||||
|
||||
# max is detached to prevent undesired backprop loops in the graph
|
||||
max_v = patch_scores.max(dim=1).values.detach()[:, None]
|
||||
x_exp = ((patch_scores - max_v) / self.temperature).exp() # M * (kernel**2), in [0, 1]
|
||||
|
||||
# \frac{ \sum{(i,j) \times \exp(x/T)} }{ \sum{\exp(x/T)} }
|
||||
xy_residual = x_exp @ self.hw_grid / x_exp.sum(dim=1)[:, None] # Soft-argmax, Mx2
|
||||
|
||||
hw_grid_dist2 = torch.norm((self.hw_grid[None, :, :] - xy_residual[:, None, :]) / self.radius,
|
||||
dim=-1) ** 2
|
||||
scoredispersity = (x_exp * hw_grid_dist2).sum(dim=1) / x_exp.sum(dim=1)
|
||||
|
||||
# compute result keypoints
|
||||
keypoints_xy_nms = torch.stack([indices_kpt % w, indices_kpt // w], dim=1) # Mx2
|
||||
keypoints_xy = keypoints_xy_nms + xy_residual
|
||||
keypoints_xy = keypoints_xy / keypoints_xy.new_tensor(
|
||||
[w - 1, h - 1]) * 2 - 1 # (w,h) -> (-1~1,-1~1)
|
||||
|
||||
kptscore = torch.nn.functional.grid_sample(scores_map[b_idx].unsqueeze(0),
|
||||
keypoints_xy.view(1, 1, -1, 2),
|
||||
mode='bilinear', align_corners=True)[0, 0, 0, :] # CxN
|
||||
|
||||
keypoints.append(keypoints_xy)
|
||||
scoredispersitys.append(scoredispersity)
|
||||
kptscores.append(kptscore)
|
||||
else:
|
||||
for b_idx in range(b):
|
||||
indices_kpt = indices_keypoints[b_idx] # one dimension vector, say its size is M
|
||||
keypoints_xy_nms = torch.stack([indices_kpt % w, indices_kpt // w], dim=1) # Mx2
|
||||
keypoints_xy = keypoints_xy_nms / keypoints_xy_nms.new_tensor(
|
||||
[w - 1, h - 1]) * 2 - 1 # (w,h) -> (-1~1,-1~1)
|
||||
kptscore = torch.nn.functional.grid_sample(scores_map[b_idx].unsqueeze(0),
|
||||
keypoints_xy.view(1, 1, -1, 2),
|
||||
mode='bilinear', align_corners=True)[0, 0, 0, :] # CxN
|
||||
keypoints.append(keypoints_xy)
|
||||
scoredispersitys.append(None)
|
||||
kptscores.append(kptscore)
|
||||
|
||||
return keypoints, scoredispersitys, kptscores
|
||||
|
||||
def forward(self, scores_map, descriptor_map, sub_pixel=False):
|
||||
"""
|
||||
:param scores_map: Bx1xHxW
|
||||
:param descriptor_map: BxCxHxW
|
||||
:param sub_pixel: whether to use sub-pixel keypoint detection
|
||||
:return: kpts: list[Nx2,...]; kptscores: list[N,....] normalised position: -1.0 ~ 1.0
|
||||
"""
|
||||
keypoints, scoredispersitys, kptscores = self.detect_keypoints(scores_map,
|
||||
sub_pixel)
|
||||
|
||||
descriptors = sample_descriptor(descriptor_map, keypoints, sub_pixel)
|
||||
|
||||
# keypoints: B M 2
|
||||
# descriptors: B M D
|
||||
# scoredispersitys:
|
||||
return keypoints, descriptors, kptscores, scoredispersitys
|
||||
393
BEVNet.py
Normal file
393
BEVNet.py
Normal file
@@ -0,0 +1,393 @@
|
||||
import math
|
||||
import torch
|
||||
import torch._utils
|
||||
import torch.nn as nn
|
||||
from typing import Optional, Callable
|
||||
from torchvision.models import resnet
|
||||
|
||||
|
||||
class RIConv2d(nn.Module):
|
||||
def __init__(self, in_channel, out_channel, kernel_size=1, stride=1, padding=0, bias=True):
|
||||
super().__init__()
|
||||
self.padding = padding
|
||||
self.stride = stride
|
||||
self.use_bias = bias
|
||||
idx = torch.arange(kernel_size ** 2).view(-1, 1)
|
||||
row = torch.div(idx, kernel_size, rounding_mode='floor')
|
||||
col = torch.fmod(idx, kernel_size)
|
||||
idx = torch.cat([row, col], dim=1)
|
||||
dis = (idx - 0.5 * (kernel_size - 1)).norm(dim=1) + 0.5 * (kernel_size % 2 - 1)
|
||||
dis = dis.view(kernel_size, kernel_size)
|
||||
dis = torch.round(dis).long()
|
||||
dis[dis > 0.5 * (kernel_size - 1)] = -1
|
||||
self.mask = dis
|
||||
self.number = int(torch.max(dis).item() + 1)
|
||||
self.weight = torch.zeros([kernel_size, kernel_size, out_channel, in_channel])
|
||||
if bias:
|
||||
self.bias = torch.nn.Parameter(torch.rand([out_channel, ]))
|
||||
else:
|
||||
self.bias = None
|
||||
self.weight1 = torch.nn.Parameter(torch.rand([self.number, out_channel, in_channel]))
|
||||
|
||||
def forward(self, x):
|
||||
weight = self.weight.to(self.weight1.device)
|
||||
for i in range(self.number):
|
||||
mask = self.mask == i
|
||||
weight[mask] = self.weight1[i]
|
||||
weight = weight.permute(2, 3, 0, 1)
|
||||
y = torch.nn.functional.conv2d(x, weight, self.bias, self.stride, self.padding)
|
||||
return y
|
||||
|
||||
def __repr__(self):
|
||||
return f"RIConv2d(in_channel={self.weight.shape[3]}, out_channel={self.weight.shape[2]}," \
|
||||
f" kernel_size={self.weight.shape[0]}, stride={self.stride}, padding={self.padding}, bias={self.bias is not None})"
|
||||
|
||||
|
||||
class RIMaxpool2d(nn.Module):
|
||||
def __init__(self, kernel_size=1, stride=1, padding=0):
|
||||
super().__init__()
|
||||
self.kernel_size = kernel_size
|
||||
self.stride = stride
|
||||
self.padding = padding
|
||||
idx = torch.arange(kernel_size ** 2).view(-1, 1)
|
||||
row = torch.div(idx, kernel_size, rounding_mode='floor')
|
||||
col = torch.fmod(idx, kernel_size)
|
||||
idx = torch.cat([row, col], dim=1)
|
||||
dis = (idx - 0.5 * (kernel_size - 1)).norm(dim=1) + 0.5 * (kernel_size % 2 - 1)
|
||||
dis = dis.view(kernel_size, kernel_size)
|
||||
dis = torch.round(dis)
|
||||
dis[dis > 0.5 * (kernel_size - 1)] = -1
|
||||
self.mask = dis.view(-1, ) > -1
|
||||
|
||||
def forward(self, x):
|
||||
B, C, H, W = x.shape
|
||||
h_out = math.floor((H + 2 * self.padding - (self.kernel_size - 1) - 1) / self.stride + 1)
|
||||
w_out = math.floor((W + 2 * self.padding - (self.kernel_size - 1) - 1) / self.stride + 1)
|
||||
unfold_x = torch.nn.functional.unfold(x, kernel_size=self.kernel_size, stride=self.stride, padding=self.padding)
|
||||
y = unfold_x.view(B, C, self.kernel_size * self.kernel_size, h_out, w_out)
|
||||
y = y.permute(2, 0, 1, 3, 4)
|
||||
y1 = y[self.mask]
|
||||
y_max = torch.max(y1, dim=0, keepdim=False)[0]
|
||||
return y_max
|
||||
|
||||
def __repr__(self):
|
||||
return f"RIMaxpool2d(kernel_size={self.kernel_size}, stride={self.stride}, padding={self.padding})"
|
||||
|
||||
|
||||
class RIAvgpool2d(nn.Module):
|
||||
def __init__(self, kernel_size=1, stride=1, padding=0):
|
||||
super().__init__()
|
||||
self.padding = padding
|
||||
self.stride = stride
|
||||
idx = torch.arange(kernel_size ** 2).view(-1, 1)
|
||||
row = torch.div(idx, kernel_size, rounding_mode='floor')
|
||||
col = torch.fmod(idx, kernel_size)
|
||||
idx = torch.cat([row, col], dim=1)
|
||||
dis = (idx - 0.5 * (kernel_size - 1)).norm(dim=1) + 0.5 * (kernel_size % 2 - 1)
|
||||
dis = dis.view(kernel_size, kernel_size)
|
||||
dis = torch.round(dis)
|
||||
dis[dis > 0.5 * (kernel_size - 1)] = -1
|
||||
mask = dis > -1
|
||||
self.number = torch.sum(mask)
|
||||
self.weight = torch.zeros([kernel_size, kernel_size, 1, 1])
|
||||
self.weight[mask] = 1
|
||||
|
||||
def forward(self, x):
|
||||
weight = self.weight.to(x.device)
|
||||
weight = weight.permute(2, 3, 0, 1)
|
||||
weight = weight.repeat(x.shape[1], 1, 1, 1)
|
||||
sum = torch.nn.functional.conv2d(x, weight, None, self.stride, self.padding, groups=x.shape[1])
|
||||
avg = sum / self.number
|
||||
return avg
|
||||
|
||||
def __repr__(self):
|
||||
return f"RIAvgpool2d(kernel_size={self.weight.shape[0]}, stride={self.stride}, padding={self.padding})"
|
||||
|
||||
|
||||
class RIConvBlock(nn.Module):
|
||||
def __init__(self, in_channels, out_channels,
|
||||
gate: Optional[Callable[..., nn.Module]] = None,
|
||||
norm_layer: Optional[Callable[..., nn.Module]] = None):
|
||||
super().__init__()
|
||||
if gate is None:
|
||||
self.gate = nn.ReLU(inplace=True)
|
||||
else:
|
||||
self.gate = gate
|
||||
if norm_layer is None:
|
||||
norm_layer = nn.BatchNorm2d
|
||||
self.conv1 = RIConv2d(in_channel=in_channels, out_channel=out_channels, kernel_size=5, padding=2, bias=False)
|
||||
self.bn1 = norm_layer(out_channels)
|
||||
self.conv2 = RIConv2d(in_channel=out_channels, out_channel=out_channels, kernel_size=5, padding=2, bias=False)
|
||||
self.bn2 = norm_layer(out_channels)
|
||||
|
||||
def forward(self, x):
|
||||
x = self.gate(self.bn1(self.conv1(x))) # B x in_channels x H x W
|
||||
x = self.gate(self.bn2(self.conv2(x))) # B x out_channels x H x W
|
||||
return x
|
||||
|
||||
class RIResBlock(nn.Module):
|
||||
expansion: int = 1
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
inplanes: int,
|
||||
planes: int,
|
||||
stride: int = 1,
|
||||
downsample: Optional[nn.Module] = None,
|
||||
groups: int = 1,
|
||||
base_width: int = 64,
|
||||
dilation: int = 1,
|
||||
gate: Optional[Callable[..., nn.Module]] = None,
|
||||
norm_layer: Optional[Callable[..., nn.Module]] = None
|
||||
) -> None:
|
||||
super(RIResBlock, self).__init__()
|
||||
if gate is None:
|
||||
self.gate = nn.ReLU(inplace=True)
|
||||
else:
|
||||
self.gate = gate
|
||||
if norm_layer is None:
|
||||
norm_layer = nn.BatchNorm2d
|
||||
if groups != 1 or base_width != 64:
|
||||
raise ValueError('ResBlock only supports groups=1 and base_width=64')
|
||||
if dilation > 1:
|
||||
raise NotImplementedError("Dilation > 1 not supported in ResBlock")
|
||||
# Both self.conv1 and self.downsample layers downsample the input when stride != 1
|
||||
self.conv1 = RIConv2d(in_channel=inplanes, out_channel=planes, kernel_size=5, stride=1, padding=2, bias=False)
|
||||
self.bn1 = norm_layer(planes)
|
||||
self.conv2 = RIConv2d(in_channel=planes, out_channel=planes, kernel_size=5, stride=1, padding=2, bias=False)
|
||||
self.bn2 = norm_layer(planes)
|
||||
self.downsample = downsample
|
||||
self.stride = stride
|
||||
|
||||
def forward(self, x: torch.Tensor) -> torch.Tensor:
|
||||
identity = x
|
||||
|
||||
out = self.conv1(x)
|
||||
out = self.bn1(out)
|
||||
out = self.gate(out)
|
||||
|
||||
out = self.conv2(out)
|
||||
out = self.bn2(out)
|
||||
|
||||
if self.downsample is not None:
|
||||
identity = self.downsample(x)
|
||||
|
||||
out += identity
|
||||
out = self.gate(out)
|
||||
|
||||
return out
|
||||
|
||||
class RICNN(nn.Module):
|
||||
def __init__(self, c1: int = 8, c2: int = 16, c3: int = 32, c4: int = 64, dim: int = 64
|
||||
):
|
||||
super().__init__()
|
||||
|
||||
self.gate = nn.ReLU(inplace=True)
|
||||
self.pool2 = RIMaxpool2d(kernel_size=2, stride=2)
|
||||
self.pool4 = RIMaxpool2d(kernel_size=5, stride=4, padding=1)
|
||||
self.block1 = RIConvBlock(3, c1, self.gate, nn.BatchNorm2d)
|
||||
self.block2 = RIResBlock(inplanes=c1, planes=c2, stride=1,
|
||||
downsample=nn.Conv2d(c1, c2, 1),
|
||||
gate=self.gate,
|
||||
norm_layer=nn.BatchNorm2d)
|
||||
self.block3 = RIResBlock(inplanes=c2, planes=c3, stride=1,
|
||||
downsample=nn.Conv2d(c2, c3, 1),
|
||||
gate=self.gate,
|
||||
norm_layer=nn.BatchNorm2d)
|
||||
self.block4 = RIResBlock(inplanes=c3, planes=c4, stride=1,
|
||||
downsample=nn.Conv2d(c3, c4, 1),
|
||||
gate=self.gate,
|
||||
norm_layer=nn.BatchNorm2d)
|
||||
|
||||
self.conv1 = resnet.conv1x1(c1, dim // 4)
|
||||
self.conv2 = resnet.conv1x1(c2, dim // 4)
|
||||
self.conv3 = resnet.conv1x1(c3, dim // 4)
|
||||
self.conv4 = resnet.conv1x1(dim, dim // 4)
|
||||
self.upsample2 = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
|
||||
self.upsample3 = nn.Upsample(scale_factor=8, mode='bilinear', align_corners=True)
|
||||
self.upsample4 = nn.Upsample(scale_factor=32, mode='bilinear', align_corners=True)
|
||||
self.out = nn.Conv2d(dim, dim + 1, 1)
|
||||
|
||||
def forward(self, image):
|
||||
x1 = self.block1(image)
|
||||
x2 = self.pool2(x1)
|
||||
x2 = self.block2(x2)
|
||||
x3 = self.pool4(x2)
|
||||
x3 = self.block3(x3)
|
||||
x4 = self.pool4(x3)
|
||||
x4 = self.block4(x4)
|
||||
x1 = self.gate(self.conv1(x1))
|
||||
x2 = self.gate(self.conv2(x2))
|
||||
x3 = self.gate(self.conv3(x3))
|
||||
x4 = self.gate(self.conv4(x4))
|
||||
x2_up = self.upsample2(x2)
|
||||
x3_up = self.upsample3(x3)
|
||||
x4_up = self.upsample4(x4)
|
||||
x1234 = torch.cat([x1, x2_up, x3_up, x4_up], dim=1)
|
||||
y = self.out(x1234)
|
||||
descriptor_map = y[:, :-1, :, :]
|
||||
scores_map = torch.sigmoid(y[:, -1, :, :]).unsqueeze(1)
|
||||
return scores_map, descriptor_map
|
||||
|
||||
def ri2maxpool(self, pool):
|
||||
stride = pool.stride
|
||||
pool_new = nn.MaxPool2d(stride)
|
||||
return pool_new
|
||||
|
||||
def maxpool2ri(self, pool):
|
||||
kernel_size = stride = pool.stride
|
||||
ds = round((math.sqrt(2) - 1) / 2 * stride - 0.25 * (stride % 2 - 1))
|
||||
kernel_size = kernel_size + ds
|
||||
pool_new = RIMaxpool2d(kernel_size, stride, ds)
|
||||
return pool_new
|
||||
|
||||
def ri2avgpool(self, pool):
|
||||
stride = pool.stride
|
||||
pool_new = nn.AvgPool2d(stride)
|
||||
return pool_new
|
||||
|
||||
def avgpool2ri(self, pool):
|
||||
kernel_size = stride = pool.stride
|
||||
if stride > 3:
|
||||
kernel_size = kernel_size + 1
|
||||
pool_new = RIAvgpool2d(kernel_size, stride)
|
||||
return pool_new
|
||||
|
||||
def ri2conv(self, conv):
|
||||
ri = conv
|
||||
weight = ri.weight
|
||||
device = ri.weight1.device
|
||||
bias = ri.bias
|
||||
use_bias = bias is not None
|
||||
weight_copy = weight.clone().to(device)
|
||||
for i in range(ri.number):
|
||||
mask = ri.mask == i
|
||||
weight_copy[mask] = ri.weight1[i]
|
||||
weight_copy = weight_copy.permute(2, 3, 0, 1)
|
||||
in_c = weight.shape[3]
|
||||
out_c = weight.shape[2]
|
||||
kz = weight.shape[0]
|
||||
sd = ri.stride
|
||||
pd = ri.padding
|
||||
conv_new = nn.Conv2d(in_channels=in_c, out_channels=out_c, kernel_size=kz, stride=sd, padding=pd, bias=use_bias)
|
||||
if use_bias:
|
||||
state_dict = {'weight': weight_copy, 'bias': bias}
|
||||
else:
|
||||
state_dict = {'weight': weight_copy}
|
||||
conv_new.load_state_dict(state_dict)
|
||||
return conv_new.to(device)
|
||||
|
||||
def conv2ri(self, conv):
|
||||
weight = conv.weight
|
||||
bias = conv.bias
|
||||
device = weight.device
|
||||
in_c = weight.shape[1]
|
||||
out_c = weight.shape[0]
|
||||
kz = weight.shape[2]
|
||||
if kz < 3:
|
||||
return conv
|
||||
sd = conv.stride
|
||||
pd = conv.padding
|
||||
idx = torch.arange(kz ** 2).view(-1, 1)
|
||||
row = torch.div(idx, kz, rounding_mode='floor')
|
||||
col = torch.fmod(idx, kz)
|
||||
idx = torch.cat([row, col], dim=1)
|
||||
dis = (idx - 0.5 * (kz - 1)).norm(dim=1) + 0.5 * (kz % 2 - 1)
|
||||
dis = dis.view(kz, kz)
|
||||
dis = torch.round(dis).long()
|
||||
dis[dis > 0.5 * (kz - 1)] = -1
|
||||
mask = dis
|
||||
number = int(torch.max(dis).item() + 1)
|
||||
weight1 = torch.rand([number, out_c, in_c]).to(device)
|
||||
weight2 = weight.clone()
|
||||
weight2 = weight2.permute(2, 3, 0, 1)
|
||||
used_bias = bias is not None
|
||||
for i in range(number):
|
||||
mask1 = mask == i
|
||||
w = weight2[mask1]
|
||||
weight1[i] = torch.mean(w, dim=0)
|
||||
if used_bias:
|
||||
state_dict = {'weight1': weight1, 'bias': bias}
|
||||
else:
|
||||
state_dict = {'weight1': weight1}
|
||||
conv_new = RIConv2d(in_channel=in_c, out_channel=out_c, kernel_size=kz, stride=sd, padding=pd, bias=used_bias)
|
||||
conv_new.load_state_dict(state_dict)
|
||||
return conv_new.to(device)
|
||||
|
||||
def disable_ri(self):
|
||||
modules = self.__dict__['_modules']
|
||||
for key, value in modules.items():
|
||||
if isinstance(value, RIMaxpool2d):
|
||||
setattr(self, key, self.ri2maxpool(value))
|
||||
if isinstance(value, RIAvgpool2d):
|
||||
setattr(self, key, self.ri2avgpool(value))
|
||||
if isinstance(value, RIConv2d):
|
||||
setattr(self, key, self.ri2conv(value))
|
||||
if 'block' in key:
|
||||
block = value
|
||||
block_modules = block.__dict__['_modules']
|
||||
for bkey, bvalue in block_modules.items():
|
||||
if isinstance(bvalue, RIMaxpool2d):
|
||||
setattr(block, bkey, self.ri2maxpool(bvalue))
|
||||
if isinstance(bvalue, RIAvgpool2d):
|
||||
setattr(block, bkey, self.ri2avgpool(bvalue))
|
||||
if isinstance(bvalue, RIConv2d):
|
||||
setattr(block, bkey, self.ri2conv(bvalue))
|
||||
modules[key] = block_modules
|
||||
setattr(self, key, block)
|
||||
|
||||
def enable_ri(self):
|
||||
modules = self.__dict__['_modules']
|
||||
for key, value in modules.items():
|
||||
if isinstance(value, nn.MaxPool2d):
|
||||
setattr(self, key, self.maxpool2ri(value))
|
||||
if isinstance(value, nn.AvgPool2d):
|
||||
setattr(self, key, self.avgpool2ri(value))
|
||||
if isinstance(value, nn.Conv2d):
|
||||
setattr(self, key, self.conv2ri(value))
|
||||
if 'block' in key:
|
||||
block = value
|
||||
block_modules = block.__dict__['_modules']
|
||||
for bkey, bvalue in block_modules.items():
|
||||
if isinstance(bvalue, nn.MaxPool2d):
|
||||
setattr(block, bkey, self.maxpool2ri(bvalue))
|
||||
if isinstance(bvalue, nn.AvgPool2d):
|
||||
setattr(block, bkey, self.avgpool2ri(bvalue))
|
||||
if isinstance(bvalue, nn.Conv2d):
|
||||
setattr(block, bkey, self.conv2ri(bvalue))
|
||||
modules[key] = block_modules
|
||||
setattr(self, key, block)
|
||||
|
||||
|
||||
class EncodePosition(nn.Module):
|
||||
def __init__(self, feature_size=128):
|
||||
super().__init__()
|
||||
self.bins = 16
|
||||
self.conv1 = nn.Sequential(
|
||||
nn.Conv1d(in_channels=self.bins, out_channels=feature_size//2, kernel_size=1, stride=1, padding=0, bias=True), nn.BatchNorm1d(feature_size//2), nn.ReLU(),
|
||||
nn.Conv1d(in_channels=feature_size//2, out_channels=feature_size//2, kernel_size=1, stride=1, padding=0, bias=True), nn.BatchNorm1d(feature_size//2), nn.ReLU(),
|
||||
nn.Conv1d(in_channels=feature_size//2, out_channels=feature_size, kernel_size=1, stride=1, padding=0, bias=True)
|
||||
)
|
||||
# self.conv2=(nn.Conv1d(in_channels=256,out_channels=128,kernel_size=1))
|
||||
|
||||
def forward(self, x, fea):
|
||||
b, n, c = x.shape
|
||||
x1 = x.unsqueeze(1)
|
||||
x2 = x.unsqueeze(2)
|
||||
dx = x1 - x2
|
||||
distance = dx.norm(p=2, dim=3)
|
||||
hists = torch.zeros([b, n, self.bins]).to(x.device)
|
||||
for i in range(b):
|
||||
for j in range(n):
|
||||
dis = distance[i, j]
|
||||
hist = torch.histc(dis, bins=self.bins, min=1, max=80)
|
||||
hists[i, j] = hist
|
||||
hists = hists / torch.sum(hists, dim=2, keepdim=True)
|
||||
x3 = hists.permute(0, 2, 1)
|
||||
x4 = self.conv1(x3)
|
||||
if hasattr(self, 'conv2'):
|
||||
x5 = torch.cat([fea, x4], dim=1)
|
||||
y = self.conv2(x5)
|
||||
else:
|
||||
y = fea + x4
|
||||
return y
|
||||
42
README.md
Normal file
42
README.md
Normal file
@@ -0,0 +1,42 @@
|
||||
# FUSION
|
||||
|
||||
## Table of Contents
|
||||
- [Paper](#paper)
|
||||
- [Overview](#overview)
|
||||
- [Prerequisites](#prerequisites)
|
||||
- [Running the Code](#running-the-code)
|
||||
- [Evaluation](#evaluation)
|
||||
## Paper
|
||||
If you find the poject helps you, you can cite our paper:
|
||||
|
||||
Cao D, Yue H, Liu Z, et al. BEVLCD+: Real-Time and Rotation-Invariant Loop Closure Detection Based on BEV of Point Cloud[J]. IEEE Transactions on Instrumentation and Measurement, 2023.
|
||||
|
||||
Yue H, Cao D, Liu Z, et al. Cross Fusion of Point Cloud and Learned Image for Loop Closure Detection[J]. IEEE Robotics and Automation Letters, 2024.
|
||||
|
||||
## Overview
|
||||
We provide code for BEV mode and fusion mode, so you can easily train and test.
|
||||
|
||||
## Prerequisites
|
||||
Before you can use this project, you'll need to do the following:
|
||||
|
||||
1. **Download Datasets**: Download the [KITTI](https://www.cvlibs.net/datasets/kitti/eval_odometry.php) and [KITTI-360](https://www.cvlibs.net/datasets/kitti-360/download.php).
|
||||
|
||||
2. **Prepare Dataset Structure**: Use `preparedataset.py` to construct a dataset structure that complies with the project's requirements. Make sure to update the necessary paths in the code.
|
||||
|
||||
3. **Prepare environment**: Use the commonds on `env.txt` to create your environment. Windows and Ubuntu is OK.
|
||||
|
||||
## Running the Code
|
||||
To run the code, follow these steps:
|
||||
|
||||
1. Configure the code to run in either BEV mode or fusion mode using the settings in `config.yaml`.
|
||||
|
||||
2. If you want to load a trained model used in the paper, ensure that you update the file path accordingly.
|
||||
|
||||
3. Run `python train.py`
|
||||
|
||||
|
||||
## Evaluation
|
||||
Evaluate the saved data using the evaluation script.
|
||||
|
||||
## Others
|
||||
If you have any questions please feel free to contact us.
|
||||
BIN
__pycache__/BEVNet.cpython-38.pyc
Normal file
BIN
__pycache__/BEVNet.cpython-38.pyc
Normal file
Binary file not shown.
BIN
__pycache__/dataset.cpython-38.pyc
Normal file
BIN
__pycache__/dataset.cpython-38.pyc
Normal file
Binary file not shown.
BIN
__pycache__/evaluate_lcd.cpython-38.pyc
Normal file
BIN
__pycache__/evaluate_lcd.cpython-38.pyc
Normal file
Binary file not shown.
BIN
__pycache__/loss.cpython-38.pyc
Normal file
BIN
__pycache__/loss.cpython-38.pyc
Normal file
Binary file not shown.
BIN
__pycache__/net.cpython-38.pyc
Normal file
BIN
__pycache__/net.cpython-38.pyc
Normal file
Binary file not shown.
BIN
__pycache__/netvlad.cpython-38.pyc
Normal file
BIN
__pycache__/netvlad.cpython-38.pyc
Normal file
Binary file not shown.
BIN
__pycache__/tools.cpython-38.pyc
Normal file
BIN
__pycache__/tools.cpython-38.pyc
Normal file
Binary file not shown.
BIN
__pycache__/uot.cpython-38.pyc
Normal file
BIN
__pycache__/uot.cpython-38.pyc
Normal file
Binary file not shown.
56
config.yaml
Normal file
56
config.yaml
Normal file
@@ -0,0 +1,56 @@
|
||||
'experiment' :
|
||||
|
||||
# 'path_dataset' : '/mnt/data/cdy/project/dataset/FUSION'
|
||||
# 'path_result': '/mnt/data/cdy/data2/results/FUSIONLCD'
|
||||
|
||||
# 'path_dataset' : 'E:\work\Project\dataset\FUSION'
|
||||
# 'path_result' : 'E:\work\Project\results\FUSIONLCD\bev2'
|
||||
|
||||
'path_dataset' : '/home/adlab36/chenyouyuan/FUSIONLCD'
|
||||
'path_result': '/home/adlab36/chenyouyuan/FUSIONLCD/result'
|
||||
'train_flag' : 0
|
||||
'validate_flag' : 1
|
||||
'test_flag' : 1
|
||||
'flag' : 'fusion'
|
||||
'cuda' : 1
|
||||
# TRAINING
|
||||
'epochs' : 200
|
||||
'batchsize' : 6
|
||||
'learning_rate' : 1.e-3
|
||||
'beta1' : 0.9
|
||||
'beta2' : 0.999
|
||||
'eps' : 1.e-8
|
||||
'weight_decay' : 5.e-6
|
||||
'load_model' : 1
|
||||
#FUSION
|
||||
# 'last_model' : '/data4/caodanyang/results/FUSIONLCD/08310/models/checkpoint_079.pth.tar'
|
||||
#BEV
|
||||
# 'last_model' : '/data4/caodanyang/results/FUSIONLCD/bev_09030/models/checkpoint_066.pth.tar'
|
||||
#BEV+EP
|
||||
'last_model' : '/home/adlab36/chenyouyuan/FUSIONLCD/result/log/models/checkpoint_199.pth.tar'
|
||||
#DATASET
|
||||
'train' : 0,5,6,7,9
|
||||
'validate' : 8,50,54,55,56,59
|
||||
'test' : 8,50,54,55,56,59
|
||||
'voxel_num' : 15000
|
||||
'voxel_max_points' : 100
|
||||
'voxel_sample' : 'top'
|
||||
# 'bev_range' : -51.2,-51.2,-2.5,51.2,51.2,1.5
|
||||
# 'bev_resolution' : 0.16
|
||||
# 'bev_range' : -64,-64,-2.5,64,64,1.5
|
||||
# 'bev_resolution' : 0.2
|
||||
'bev_range' : -32,-32,-2.5,32,32,1.5
|
||||
'bev_resolution' : 0.2
|
||||
|
||||
# NETWORK PARAMS
|
||||
'kpts_number_bev' : 150
|
||||
'kpts_number_img' : 150
|
||||
'cluster_num_bev' : 16
|
||||
'cluster_num_img' : 16
|
||||
'cluster_num_fusion' : 16
|
||||
'sinkhorn_iter' : 5
|
||||
'vlad_size' : 256
|
||||
# LOSS
|
||||
'loop_file' : 'loop_GT_4m'
|
||||
'trip_margin' : 0.5
|
||||
'negetative_selsector' : 'random'
|
||||
751
dataset.py
Normal file
751
dataset.py
Normal file
@@ -0,0 +1,751 @@
|
||||
import glob
|
||||
import math
|
||||
import os
|
||||
import pickle
|
||||
from functools import reduce
|
||||
import matplotlib.pylab as plt
|
||||
import cv2
|
||||
import numba
|
||||
import numpy as np
|
||||
import torch
|
||||
import yaml
|
||||
from scipy.spatial.distance import cdist
|
||||
from torch.utils.data import Dataset, DataLoader, ConcatDataset
|
||||
from torch.utils.data.dataloader import default_collate
|
||||
from torch.nn.utils.rnn import pad_sequence
|
||||
|
||||
import tools
|
||||
|
||||
IMG_HEIGHT = 384
|
||||
IMG_WIDTH = 1152
|
||||
EGEG_PROJ = 10
|
||||
IMAGE_SCALE = 0.5
|
||||
|
||||
def euler2mat(z, y, x):
|
||||
Ms = []
|
||||
if z:
|
||||
cosz = math.cos(z)
|
||||
sinz = math.sin(z)
|
||||
Ms.append(np.array(
|
||||
[[cosz, -sinz, 0],
|
||||
[sinz, cosz, 0],
|
||||
[0, 0, 1]]))
|
||||
if y:
|
||||
cosy = math.cos(y)
|
||||
siny = math.sin(y)
|
||||
Ms.append(np.array(
|
||||
[[cosy, 0, siny],
|
||||
[0, 1, 0],
|
||||
[-siny, 0, cosy]]))
|
||||
if x:
|
||||
cosx = math.cos(x)
|
||||
sinx = math.sin(x)
|
||||
Ms.append(np.array(
|
||||
[[1, 0, 0],
|
||||
[0, cosx, -sinx],
|
||||
[0, sinx, cosx]]))
|
||||
if Ms:
|
||||
return reduce(np.dot, Ms[::-1])
|
||||
return np.eye(3)
|
||||
|
||||
|
||||
def rt_mat(rx, ry, rz, tx, ty, tz):
|
||||
rt = np.eye(4, dtype=np.float32)
|
||||
r = euler2mat(rz, ry, rx)
|
||||
rt[0:3, 0:3] = r
|
||||
rt[0:3, 3] = [tx, ty, tz]
|
||||
return rt
|
||||
|
||||
@numba.jit(nopython=True)
|
||||
def _points_to_voxel_reverse_kernel(points,
|
||||
voxel_size,
|
||||
coors_range,
|
||||
num_points_per_voxel,
|
||||
coor_to_voxelidx,
|
||||
voxels,
|
||||
coors,
|
||||
max_points,
|
||||
max_voxels,voxel_idx_empty,voxel_mamiz):
|
||||
# put all computations to one loop.
|
||||
# we shouldn't create large array in main jit code, otherwise
|
||||
# reduce performance
|
||||
N = points.shape[0]
|
||||
# ndim = points.shape[1] - 1
|
||||
ndim = 3
|
||||
ndim_minus_1 = ndim - 1
|
||||
grid_size = (coors_range[3:] - coors_range[:3]) / voxel_size
|
||||
# np.round(grid_size)
|
||||
# grid_size = np.round(grid_size).astype(np.int64)(np.int32)
|
||||
grid_size = np.round(grid_size, 0, grid_size).astype(np.int32)
|
||||
coor = np.zeros(shape=(3,), dtype=np.int32)
|
||||
voxel_num = 0
|
||||
failed = False
|
||||
for i in range(N):
|
||||
failed = False
|
||||
for j in range(ndim):
|
||||
c = np.floor((points[i, j] - coors_range[j]) / voxel_size[j])
|
||||
if c < 0 or c >= grid_size[j]:
|
||||
failed = True
|
||||
break
|
||||
coor[ndim_minus_1 - j] = c
|
||||
if failed:
|
||||
continue
|
||||
voxelidx = coor_to_voxelidx[coor[0], coor[1], coor[2]]#0-15000
|
||||
if voxelidx == -1:
|
||||
voxelidx = voxel_num
|
||||
if voxel_num >= max_voxels:
|
||||
break
|
||||
voxel_num += 1
|
||||
coor_to_voxelidx[coor[0], coor[1], coor[2]] = voxelidx
|
||||
coors[voxelidx] = coor
|
||||
num = num_points_per_voxel[voxelidx]
|
||||
if num < max_points:
|
||||
voxel_idx_empty[voxelidx,num_points_per_voxel[voxelidx]]=1
|
||||
if points[i,2]>voxel_mamiz[voxelidx,0]:
|
||||
voxel_mamiz[voxelidx,0]=points[i,2]
|
||||
if points[i,2]<voxel_mamiz[voxelidx,1]:
|
||||
voxel_mamiz[voxelidx,1]=points[i,2]
|
||||
voxels[voxelidx, num] = points[i]
|
||||
num_points_per_voxel[voxelidx] += 1
|
||||
|
||||
return voxel_num
|
||||
|
||||
|
||||
def points_to_voxel(points,
|
||||
voxel_size,
|
||||
coors_range,
|
||||
max_points=35,
|
||||
reverse_index=True,
|
||||
max_voxels=20000):
|
||||
"""convert kitti points(N, >=3) to voxels. This version calculate
|
||||
everything in one loop. now it takes only 4.2ms(complete point cloud)
|
||||
with jit and 3.2ghz cpu.(don't calculate other features)
|
||||
Note: this function in ubuntu seems faster than windows 10.
|
||||
|
||||
Args:
|
||||
points: [N, ndim] float tensor. points[:, :3] contain xyz points and
|
||||
points[:, 3:] contain other information such as reflectivity.
|
||||
voxel_size: [3] list/tuple or array, float. xyz, indicate voxel size
|
||||
coors_range: [6] list/tuple or array, float. indicate voxel range.
|
||||
format: xyzxyz, minmax
|
||||
max_points: int. indicate maximum points contained in a voxel.
|
||||
reverse_index: boolean. indicate whether return reversed coordinates.
|
||||
if points has xyz format and reverse_index is True, output
|
||||
coordinates will be zyx format, but points in features always
|
||||
xyz format.
|
||||
max_voxels: int. indicate maximum voxels this function create.
|
||||
for second, 20000 is a good choice. you should shuffle points
|
||||
before call this function because max_voxels may drop some points.
|
||||
|
||||
Returns:
|
||||
voxels: [M, max_points, ndim] float tensor. only contain points.
|
||||
coordinates: [M, 3] int32 tensor.
|
||||
num_points_per_voxel: [M] int32 tensor.
|
||||
"""
|
||||
if not isinstance(voxel_size, np.ndarray):
|
||||
voxel_size = np.array(voxel_size, dtype=points.dtype)
|
||||
if not isinstance(coors_range, np.ndarray):
|
||||
coors_range = np.array(coors_range, dtype=points.dtype)
|
||||
voxelmap_shape = (coors_range[3:] - coors_range[:3]) / voxel_size
|
||||
voxelmap_shape = tuple(np.round(voxelmap_shape).astype(np.int32).tolist())
|
||||
if reverse_index:
|
||||
voxelmap_shape = voxelmap_shape[::-1]
|
||||
# don't create large array in jit(nopython=True) code.
|
||||
num_points_per_voxel = np.zeros(shape=(max_voxels,), dtype=np.int32)
|
||||
coor_to_voxelidx = -np.ones(shape=voxelmap_shape, dtype=np.int32)
|
||||
voxels = np.zeros(shape=(max_voxels, max_points, points.shape[-1]), dtype=points.dtype)
|
||||
coors = np.zeros(shape=(max_voxels, 3), dtype=np.int32)
|
||||
voxel_idx_empty = np.zeros((max_voxels,max_points),dtype=np.int32)
|
||||
voxel_mamiz=np.zeros((max_voxels,2),dtype=points.dtype)
|
||||
voxel_mamiz[:,0]=-99
|
||||
voxel_mamiz[:,1]=99
|
||||
voxel_num = _points_to_voxel_reverse_kernel(
|
||||
points, voxel_size, coors_range, num_points_per_voxel,
|
||||
coor_to_voxelidx, voxels, coors, max_points, max_voxels,voxel_idx_empty,voxel_mamiz)
|
||||
|
||||
# coors = coors[:voxel_num]
|
||||
# voxels = voxels[:voxel_num]
|
||||
# num_points_per_voxel = num_points_per_voxel[:voxel_num]
|
||||
# voxels[:, :, -3:] = voxels[:, :, :3] - \
|
||||
# voxels[:, :, :3].sum(axis=1, keepdims=True)/num_points_per_voxel.reshape(-1, 1, 1)
|
||||
return voxels, coors, num_points_per_voxel, coor_to_voxelidx,voxel_idx_empty,voxel_mamiz
|
||||
|
||||
|
||||
@numba.jit(nopython=True)
|
||||
def pixel_choose(pixel1, pixel2):
|
||||
n = pixel1.shape[0]
|
||||
k = pixel1.shape[1]
|
||||
k1 = pixel2.shape[1]
|
||||
for i in range(n):
|
||||
idx = []
|
||||
for j in range(k):
|
||||
if pixel1[i, j, 0] > -1:
|
||||
idx.append(j)
|
||||
k2 = len(idx)
|
||||
idx=np.asarray(idx)
|
||||
if k2 >= k1:
|
||||
choice = np.random.choice(idx, k1, replace=False)
|
||||
else:
|
||||
choice = np.random.choice(idx, k1, replace=True)
|
||||
for j in range(k1):
|
||||
pixel2[i,j] = pixel1[i, choice[j]]
|
||||
|
||||
def pointcloud_encoder(pointcloud=None, cfg=None):
|
||||
try:
|
||||
resolution = cfg['bev_resolution']
|
||||
pointcloud_range = [float(x) for x in tools.read_cfg(cfg['bev_range'])]
|
||||
voxel_max_points = cfg['voxel_max_points']
|
||||
voxel_num = cfg['voxel_num']
|
||||
voxel_sample = cfg['voxel_sample']
|
||||
except:
|
||||
resolution = 0.2
|
||||
pointcloud_range = [-40, -40, -2.5, 40, 40, 1.5]
|
||||
voxel_max_points = 100
|
||||
voxel_num = 15000
|
||||
voxel_sample = 'top'
|
||||
|
||||
pc_filter = (pointcloud[:, 0] > pointcloud_range[0]) & (pointcloud[:, 0] < pointcloud_range[3]) \
|
||||
& (pointcloud[:, 1] > pointcloud_range[1]) & (pointcloud[:, 1] < pointcloud_range[4]) \
|
||||
& (pointcloud[:, 2] > pointcloud_range[2]) & (pointcloud[:, 2] < pointcloud_range[5])
|
||||
pointcloud = pointcloud[pc_filter]
|
||||
if voxel_sample == 'top':
|
||||
idx = np.argsort(-pointcloud[:, 2])
|
||||
pointcloud = pointcloud[idx]
|
||||
else:
|
||||
idx = np.arange(len(pointcloud))
|
||||
np.random.shuffle(idx)
|
||||
pointcloud = pointcloud[idx]
|
||||
resolution_z=pointcloud_range[5]-pointcloud_range[2]
|
||||
voxels, coors, num_points_per_voxel, coor_to_voxelidx,voxel_idx_empty,voxel_mamiz = points_to_voxel(pointcloud,
|
||||
voxel_size=[
|
||||
resolution, resolution,resolution_z],
|
||||
coors_range=[
|
||||
*pointcloud_range],
|
||||
max_points=voxel_max_points,
|
||||
max_voxels=voxel_num)
|
||||
|
||||
|
||||
coor_to_voxelidx = np.squeeze(coor_to_voxelidx)
|
||||
voxel_idx_empty=voxel_idx_empty.astype(np.bool_)
|
||||
|
||||
voxels_center = np.sum(voxels, axis=1)
|
||||
voxels_center[:, 0:4] = voxels_center[:, 0:4] / (num_points_per_voxel.reshape(-1, 1)+1e-8)
|
||||
|
||||
max_z = voxel_mamiz[:,0]
|
||||
min_z = voxel_mamiz[:,1]
|
||||
max_z1 = max_z / resolution_z
|
||||
mean_i = np.sum(voxels[:, :, 3], axis=1) / (num_points_per_voxel+1e-8)
|
||||
density = np.log(np.clip(num_points_per_voxel, 1, None)) / np.log(cfg['voxel_max_points'])
|
||||
|
||||
|
||||
dz = max_z - min_z
|
||||
idx_not_ground = (dz > 0.05) & (num_points_per_voxel > 1)
|
||||
coors1 = coors[idx_not_ground, 1:3]
|
||||
relation = 0
|
||||
if voxels.shape[2] == 6: # x,y,z,i,pu,pv
|
||||
have_pixel = (voxels[:, :, 4] > 0) & (voxels[:, :, 5] > 0) # each 3d point has a pixel
|
||||
have_pixel = np.bitwise_or.reduce(have_pixel, axis=1) # each cell of bev has pixels,num of pixels may less than num of 3d points
|
||||
|
||||
# num_voxels_pixel = np.sum(voxels_pixel, axis=1)
|
||||
# num_voxels_pixel = np.clip(num_voxels_pixel, 1, None)
|
||||
# voxels_center[:, 4:6] = np.int_(voxels_center[:, 4:6] / num_voxels_pixel.reshape(-1, 1))
|
||||
feature = np.concatenate([max_z1, mean_i, density,
|
||||
voxels_center[:, 0], voxels_center[:, 1], voxels_center[:, 2], voxels_center[:, 3]],
|
||||
axis=0).reshape(7, -1).T
|
||||
pixels = voxels[idx_not_ground & have_pixel, :, 4:6]
|
||||
# pixels2 = np.zeros([pixels.shape[0], 10, 2], dtype=pixels.dtype)
|
||||
# pixel_choose(pixels, pixels2)
|
||||
# pixels = pixels2
|
||||
|
||||
coors2 = coors[idx_not_ground & have_pixel, 1:3].reshape(-1, 1, 2)
|
||||
relation = np.hstack((pixels, coors2))
|
||||
|
||||
elif voxels.shape[2] == 4: # x,y,z,i
|
||||
feature = np.concatenate([max_z1, mean_i, density,
|
||||
voxels_center[:, 0], voxels_center[:, 1], voxels_center[:, 2], voxels_center[:, 3]],
|
||||
axis=0).reshape(7, -1).T
|
||||
else:
|
||||
print('ERROR VOXEL')
|
||||
exit()
|
||||
# eigvalues = voxel_svd(voxels)
|
||||
# feature = np.hstack((feature, eigvalues))
|
||||
bev = np.zeros([coor_to_voxelidx.shape[1], coor_to_voxelidx.shape[1], feature.shape[1]], dtype=np.float32)
|
||||
bev[coors1[:, 0], coors1[:, 1]] = feature[idx_not_ground]
|
||||
# bev[coors1[:, 0], coors1[:, 1]] = feature[idx_not_ground]
|
||||
# bev_show=np.uint8(bev[:,:,0:3]*255)
|
||||
# cv2.imshow('1',bev_show)
|
||||
# cv2.waitKey(0)
|
||||
|
||||
return bev, relation
|
||||
|
||||
|
||||
def crop_image(img, height=IMG_HEIGHT, width=IMG_WIDTH):
|
||||
# img: the original image, a numpy array of size H*W*C
|
||||
# height: the target height
|
||||
# width: the target width
|
||||
|
||||
# get the size of the original image
|
||||
h, w, _ = img.shape
|
||||
|
||||
# calculate the padding size if necessary
|
||||
if h < height:
|
||||
pad_top = (height - h) // 2
|
||||
pad_bottom = height - h - pad_top
|
||||
else:
|
||||
pad_top, pad_bottom = 0, 0
|
||||
if w < width:
|
||||
pad_left = (width - w) // 2
|
||||
pad_right = width - w - pad_left
|
||||
else:
|
||||
pad_left, pad_right = 0, 0
|
||||
|
||||
# pad the original image with black pixels if necessary
|
||||
img_padded = cv2.copyMakeBorder(img, pad_top, pad_bottom, pad_left, pad_right, cv2.BORDER_CONSTANT, value=(0, 0, 0))
|
||||
|
||||
# crop the padded image to the target size
|
||||
h1 = (img_padded.shape[0] - height) // 2
|
||||
h2 = h1 + height
|
||||
w1 = (img_padded.shape[1] - width) // 2
|
||||
w2 = w1 + width
|
||||
img_cropped = img_padded[h1:h2, w1:w2]
|
||||
dh = int((height - h) / 2)
|
||||
dw = int((width - w) / 2)
|
||||
if IMAGE_SCALE<1:
|
||||
img_cropped = cv2.resize(img_cropped, None, None, IMAGE_SCALE, IMAGE_SCALE)
|
||||
dh = int(dh * IMAGE_SCALE)
|
||||
dw = int(dw * IMAGE_SCALE)
|
||||
|
||||
return img_cropped, dh, dw
|
||||
|
||||
|
||||
class KittiDataset(Dataset):
|
||||
def __init__(self, cfg, sequence, argument=True, mode='train', flag_bev=True, flag_img=True, flag_fuse=True):
|
||||
root_dataset = cfg['path_dataset']
|
||||
if (flag_img == False) & (flag_bev == False) & (flag_fuse == False):
|
||||
print('No module will be used!')
|
||||
exit()
|
||||
if (flag_img == False) or (flag_bev == False):
|
||||
flag_fuse = False
|
||||
self.flag_img = flag_img
|
||||
self.flag_bev = flag_bev
|
||||
self.flag_fuse = flag_fuse
|
||||
self.cfg = cfg
|
||||
self.sequence = sequence
|
||||
self.mode = mode
|
||||
scans = glob.glob(os.path.join(root_dataset, 'sequences', '%02d' % sequence, 'velodyne', "*.bin"))
|
||||
images = glob.glob(os.path.join(root_dataset, 'sequences', '%02d' % sequence, 'image_2', "*.png"))
|
||||
if int(sequence) >= 50:
|
||||
calib = np.loadtxt(os.path.join(root_dataset, 'sequences', '%02d' % sequence, 'calib.txt'))
|
||||
poses = os.path.join(os.path.join(root_dataset, 'sequences', '%02d' % sequence, 'poses.npy'))
|
||||
else:
|
||||
calib = np.genfromtxt(os.path.join(root_dataset, 'sequences', '%02d' % sequence, 'calib.txt'))[:, 1:]
|
||||
poses = os.path.join(os.path.join(root_dataset, 'sequences', '%02d' % sequence, 'poses.txt'))
|
||||
f_gt = open(os.path.join(root_dataset, 'sequences', '%02d' % sequence, cfg['loop_file'] + '.pickle'), 'rb')
|
||||
scans.sort()
|
||||
images.sort()
|
||||
self.scans = scans
|
||||
self.images = images
|
||||
if int(sequence) >= 50:
|
||||
cam0_to_velo = np.reshape(calib, (3, 4))
|
||||
cam0_to_velo = np.vstack([cam0_to_velo, [0, 0, 0, 1]])
|
||||
cam0_to_velo = np.linalg.inv(cam0_to_velo)
|
||||
self.cam0_to_velo = cam0_to_velo.astype(np.float32)
|
||||
k = [552.554261, 0.000000, 682.049453, 0.000000,
|
||||
0.000000, 552.554261, 238.769549, 0.000000,
|
||||
0.000000, 0.000000, 1.000000, 0.000000]
|
||||
k = np.array(k).reshape([3, 4])
|
||||
p2 = np.eye(4)
|
||||
p2[:3] = k
|
||||
self.p2 = p2.astype(np.float32)
|
||||
poses2 = np.load(poses)
|
||||
else:
|
||||
p2 = np.reshape(calib[2], (3, 4))
|
||||
p2 = np.vstack([p2, [0, 0, 0, 1]])
|
||||
self.p2 = p2.astype(np.float32)
|
||||
cam0_to_velo = np.reshape(calib[4], (3, 4))
|
||||
cam0_to_velo = np.vstack([cam0_to_velo, [0, 0, 0, 1]])
|
||||
self.cam0_to_velo = cam0_to_velo.astype(np.float32)
|
||||
cam0_to_velo = torch.tensor(cam0_to_velo)
|
||||
poses2 = []
|
||||
with open(poses, 'r') as f:
|
||||
for x in f:
|
||||
x = x.strip().split()
|
||||
x = [float(v) for v in x]
|
||||
pose = torch.zeros((4, 4), dtype=torch.float64)
|
||||
pose[0, 0:4] = torch.tensor(x[0:4])
|
||||
pose[1, 0:4] = torch.tensor(x[4:8])
|
||||
pose[2, 0:4] = torch.tensor(x[8:12])
|
||||
pose[3, 3] = 1.0
|
||||
pose = cam0_to_velo.inverse() @ (pose @ cam0_to_velo) #
|
||||
poses2.append(pose.float().numpy())
|
||||
poses2 = np.stack(poses2)
|
||||
# for i in range(12):
|
||||
# plt.subplot(3, 4, i + 1), plt.plot(np.arange(len(poses2)), poses2[:, i // 4, i % 4])
|
||||
# plt.show()
|
||||
self.poses = poses2
|
||||
|
||||
gt = pickle.load(f_gt)
|
||||
self.gt=gt
|
||||
# gt_new=[]
|
||||
# for i in range(len(gt)):
|
||||
# idx=gt[i]['idx']
|
||||
# positive_idxs=gt[i]['positive_idxs']
|
||||
# for j in positive_idxs:
|
||||
# sample={'idx':idx,'positive_idxs':[j]}
|
||||
# gt_new.append(sample)
|
||||
# self.gt=gt_new
|
||||
self.argument = argument
|
||||
|
||||
def __len__(self):
|
||||
if self.mode == 'test':
|
||||
return len(self.poses)
|
||||
else:
|
||||
return int(len(self.gt))
|
||||
|
||||
def __getitem__(self, idx):
|
||||
if self.mode == 'test':
|
||||
idx_query = idx
|
||||
pose_query = self.poses[idx_query]
|
||||
image_query, scan_query, bev_query, dw, dh, W, H, relation_query = 0, 0, 0, 0, 0, 0, 0, 0
|
||||
if self.flag_img:
|
||||
image_query = cv2.imread(self.images[idx_query])
|
||||
# image_query = cv2.GaussianBlur(image_query, (15,15),0)
|
||||
image_query, dh, dw = crop_image(image_query, IMG_HEIGHT, IMG_WIDTH)
|
||||
if self.flag_bev:
|
||||
print("DEBUG __len__ scans:", len(self.scans), "requested idx_query:", idx_query)
|
||||
scan_query = np.fromfile(self.scans[idx_query], dtype=np.float32).reshape((-1, 4))
|
||||
# idx = np.random.choice(len(scan_query), int(len(scan_query) /4), replace=False)
|
||||
# scan_query = scan_query[idx]
|
||||
if self.flag_bev & self.flag_img & self.flag_fuse:
|
||||
# mat_proj = np.matmul(self.p2, self.cam0_to_velo)
|
||||
mat_proj = torch.matmul(torch.from_numpy(self.p2), torch.from_numpy(self.cam0_to_velo)).numpy()
|
||||
pts_query = scan_query.copy()
|
||||
pts_query[:, 3] = 1
|
||||
# pts_proj_query = np.matmul(mat_proj, pts_query.T).T
|
||||
pts_proj_query = torch.matmul(torch.from_numpy(mat_proj), torch.from_numpy(pts_query.T)).numpy().T
|
||||
z = pts_proj_query[:, 2:3]
|
||||
pts_proj_query = pts_proj_query / z * IMAGE_SCALE
|
||||
pts_proj_query[:, 0:2] = pts_proj_query[:, 0:2] + [dw, dh]
|
||||
H, W, _ = image_query.shape
|
||||
mask_query = (pts_proj_query[:, 0] >= EGEG_PROJ) & (pts_proj_query[:, 0] < W - EGEG_PROJ) & (
|
||||
pts_proj_query[:, 1] >= EGEG_PROJ) & (pts_proj_query[:, 1] < H - EGEG_PROJ) & (z[:, 0] >= 0)
|
||||
pts_proj_query[~mask_query] = -1
|
||||
pixel_query = pts_proj_query[:, 0:2]
|
||||
pixel_query = pixel_query[:, [1, 0]]
|
||||
scan_query = np.hstack((scan_query, pixel_query))
|
||||
bev_query, relation_query = pointcloud_encoder(scan_query, self.cfg)
|
||||
sample = {
|
||||
'sequence': self.sequence,
|
||||
'id_query': idx_query,
|
||||
'bev_query': bev_query,
|
||||
'img_query': image_query,
|
||||
'pose_query': pose_query,
|
||||
'relation_query': relation_query
|
||||
}
|
||||
else:
|
||||
gt = self.gt[idx]
|
||||
idx_query = gt['idx']
|
||||
idx_ps = gt['positive_idxs']
|
||||
idx_positive = np.random.choice(idx_ps)
|
||||
pose_query = self.poses[idx_query]
|
||||
pose_positive = self.poses[idx_positive]
|
||||
image_query, scan_query, bev_query, image_positive, scan_query, bev_positive, dw, dh, W, H, pose_to_frame, \
|
||||
label_score, relation_query, relation_positive, = 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
|
||||
if self.flag_img:
|
||||
image_query = cv2.imread(self.images[idx_query])
|
||||
image_query, dh, dw = crop_image(image_query)
|
||||
image_positive = cv2.imread(self.images[idx_positive])
|
||||
image_positive, _, _ = crop_image(image_positive)
|
||||
if self.flag_bev:
|
||||
scan_query = np.fromfile(self.scans[idx_query], dtype=np.float32).reshape((-1, 4))
|
||||
|
||||
scan_positive = np.fromfile(self.scans[idx_positive], dtype=np.float32).reshape((-1, 4))
|
||||
|
||||
# return {'1':np.zeros(5)}
|
||||
# import open3d as o3d
|
||||
# pcd1 = o3d.geometry.PointCloud()
|
||||
# sc1 = scan_query.copy()
|
||||
# sc1[:, 3] = 1
|
||||
# sc1 = np.matmul(pose_query, sc1.T).T
|
||||
# pcd1.points = o3d.utility.Vector3dVector(sc1[:, :3])
|
||||
# pcd1.colors = o3d.utility.Vector3dVector([[0, 0, 1] for i in range(len(pcd1.points))])
|
||||
# pcd2 = o3d.geometry.PointCloud()
|
||||
# sc2 = scan_positive.copy()
|
||||
# sc2[:, 3] = 1
|
||||
# sc2 = np.matmul(pose_positive, sc2.T).T
|
||||
# pcd2.points = o3d.utility.Vector3dVector(sc2[:, :3])
|
||||
# pcd2.colors = o3d.utility.Vector3dVector([[0, 1, 0] for i in range(len(pcd2.points))])
|
||||
# vis1 = o3d.visualization.Visualizer()
|
||||
# vis1.create_window(window_name='registration', width=600, height=600) # 创建窗口
|
||||
# render_option: o3d.visualization.RenderOption = vis1.get_render_option() # 设置点云渲染参数
|
||||
# render_option.background_color = np.array([1, 1, 1]) # 设置背景色(这里为黑色)
|
||||
# render_option.point_size = 2 # 设置渲染点的大小
|
||||
# vis1.add_geometry(pcd1)
|
||||
# vis1.add_geometry(pcd2)
|
||||
# coord_frame = o3d.geometry.TriangleMesh.create_coordinate_frame(size=1.0, origin=pose_query[0:3,3])
|
||||
# vis1.add_geometry(coord_frame)
|
||||
# vis1.run()
|
||||
|
||||
if self.argument:
|
||||
# Rt1 = np.eye(4)
|
||||
# idx1=np.random.random((len(scan_query),))
|
||||
# idx1=idx1>0.1
|
||||
# scan_query=scan_query[idx1]
|
||||
# idx2=np.random.random((len(scan_positive),))
|
||||
# idx2=idx2>0.1
|
||||
# scan_positive=scan_positive[idx2]
|
||||
rand = np.random.random(6) * 2 - 1
|
||||
Rt = rt_mat(rand[0] * 3 / 180 * np.pi,
|
||||
rand[1] * 3 / 180 * np.pi,
|
||||
rand[2] * 180 / 180 * np.pi,
|
||||
rand[3] * 3, rand[4] * 3, rand[5] * 0.3)
|
||||
Rt1 = torch.from_numpy(Rt).inverse().numpy()
|
||||
ints = scan_positive[:, 3].copy()
|
||||
scan_positive[:, 3] = 1
|
||||
# scan_positive = np.matmul(Rt, scan_positive.T).T
|
||||
scan_positive = torch.matmul(torch.from_numpy(Rt),torch.from_numpy(scan_positive.T)).numpy().T
|
||||
scan_positive[:, 3] = ints
|
||||
# pose_positive = np.matmul(pose_positive, np.linalg.inv(Rt))
|
||||
pose_positive=torch.matmul(torch.from_numpy(pose_positive),torch.from_numpy(Rt).inverse()).numpy()
|
||||
else:
|
||||
Rt1 = np.eye(4).astype(np.float32)
|
||||
if self.flag_fuse:
|
||||
# mat_proj = np.matmul(self.p2, self.cam0_to_velo)
|
||||
mat_proj = torch.matmul(torch.from_numpy(self.p2), torch.from_numpy(self.cam0_to_velo)).numpy()
|
||||
pts_query = scan_query.copy()
|
||||
pts_query[:, 3] = 1
|
||||
# pts_proj_query = np.matmul(mat_proj, pts_query.T).T
|
||||
pts_proj_query = torch.matmul(torch.from_numpy(mat_proj), torch.from_numpy(pts_query.T)).numpy().T
|
||||
z = pts_proj_query[:, 2:3]
|
||||
pts_proj_query = pts_proj_query / z * IMAGE_SCALE
|
||||
pts_proj_query[:, 0:2] = pts_proj_query[:, 0:2] + [dw, dh]
|
||||
H, W, _ = image_query.shape
|
||||
mask_query = (pts_proj_query[:, 0] >= EGEG_PROJ) & (pts_proj_query[:, 0] < W - EGEG_PROJ) & (
|
||||
pts_proj_query[:, 1] >= EGEG_PROJ) & (pts_proj_query[:, 1] < H - EGEG_PROJ) & (z[:, 0] >= 0)
|
||||
pts_proj_query[~mask_query] = -1
|
||||
pixel_query = pts_proj_query[:, 0:2]
|
||||
pixel_query = pixel_query[:, [1, 0]] # h,w
|
||||
scan_query = np.hstack((scan_query, pixel_query))
|
||||
|
||||
# fig = plt.figure()
|
||||
# plt.subplot(2, 1, 1), plt.imshow(image_query[:, :, [2, 1, 0]])
|
||||
# plt.subplot(2, 1, 2), plt.imshow(image_query[:, :, [2, 1, 0]])
|
||||
# plt.scatter(pixel_query[mask_query, 1], pixel_query[mask_query, 0], c=z[mask_query], cmap='jet', alpha=0.5, s=1)
|
||||
# plt.show()
|
||||
|
||||
# mat_proj1 = self.p2.dot(self.cam0_to_velo).dot(Rt1)
|
||||
mat_proj1 = torch.matmul(torch.matmul(torch.from_numpy(self.p2), torch.from_numpy(self.cam0_to_velo)), torch.from_numpy(Rt1)).numpy()
|
||||
pts_positive = scan_positive.copy()
|
||||
pts_positive[:, 3] = 1
|
||||
# pts_proj_positive = np.matmul(mat_proj1, pts_positive.T).T
|
||||
pts_proj_positive = torch.matmul(torch.from_numpy(mat_proj1), torch.from_numpy(pts_positive.T)).numpy().T
|
||||
z = pts_proj_positive[:, 2:3]
|
||||
pts_proj_positive = pts_proj_positive / z * IMAGE_SCALE
|
||||
pts_proj_positive[:, 0:2] = pts_proj_positive[:, 0:2] + [dw, dh]
|
||||
mask_positive = (pts_proj_positive[:, 0] >= EGEG_PROJ) & (pts_proj_positive[:, 0] < W - EGEG_PROJ) & (
|
||||
pts_proj_positive[:, 1] >= 0 - EGEG_PROJ) & (pts_proj_positive[:, 1] < H - EGEG_PROJ) & (z[:, 0] >= 0)
|
||||
pts_proj_positive[~mask_positive] = -1
|
||||
pixel_positive = pts_proj_positive[:, 0:2]
|
||||
pixel_positive = pixel_positive[:, [1, 0]]
|
||||
scan_positive = np.hstack((scan_positive, pixel_positive))
|
||||
|
||||
# pose_to_frame = np.matmul(np.linalg.inv(pose_positive), pose_query)
|
||||
pose_to_frame=torch.matmul(torch.from_numpy(pose_positive).inverse(),torch.from_numpy(pose_query)).numpy()
|
||||
# scan_query1=scan_query.copy()
|
||||
# scan_query1[:,3]=1
|
||||
# scan_query1=np.matmul(pose_to_frame,scan_query1.T).T
|
||||
# scan_query[:,0:3]=scan_query1[:,0:3]
|
||||
# scan_query[:, 3] = 1
|
||||
# scan_query = np.matmul(pose_query, scan_query.T).T
|
||||
# scan_positive[:, 3] = 1
|
||||
# scan_positive = np.matmul(pose_positive, scan_positive.T).T
|
||||
# plt.subplot(1, 2, 1), plt.plot(scan_query[:, 0], scan_query[:, 1], 'b.', markersize=1),plt.axis([-60,60,-60,60])
|
||||
# plt.subplot(1, 2, 2), plt.plot(scan_positive[:, 0], scan_positive[:, 1], 'b.', markersize=1),plt.axis([-60,60,-60,60])
|
||||
# plt.show()
|
||||
|
||||
bev_query, relation_query = pointcloud_encoder(scan_query, self.cfg)
|
||||
bev_positive, relation_positive = pointcloud_encoder(scan_positive, self.cfg)
|
||||
# if self.argument:
|
||||
# rand = np.random.randint(0, 9, [2, ])
|
||||
# if rand[0] > 4:
|
||||
# bev_query1 = np.rot90(bev_query, 2, axes=(0, 1))
|
||||
# bev_query = bev_query1.copy()
|
||||
# if rand[1] > 4:
|
||||
# bev_positive1 = np.rot90(bev_positive, 2, axes=(0, 1))
|
||||
# bev_positive = bev_positive1.copy()
|
||||
h_bev, w_bev, _ = bev_query.shape
|
||||
label_score = np.zeros_like(bev_positive[:, :, :2])
|
||||
grid = np.array(np.meshgrid(np.arange(h_bev), np.arange(w_bev))).swapaxes(0, 2)
|
||||
scan_query_sample = bev_query[:, :, 3:5]
|
||||
mask_query = scan_query_sample != 0
|
||||
mask_query = mask_query[:, :, 0] | mask_query[:, :, 1]
|
||||
grid_query = grid[mask_query]
|
||||
scan_query_sample = scan_query_sample[mask_query]
|
||||
scan_positive_sample = bev_positive[:, :, 3:5]
|
||||
mask_positive = scan_positive_sample != 0
|
||||
mask_positive = mask_positive[:, :, 0] | mask_positive[:, :, 1]
|
||||
grid_positive = grid[mask_positive]
|
||||
scan_positive_sample = scan_positive_sample[mask_positive]
|
||||
scan_query_sample1 = np.hstack((scan_query_sample, scan_query_sample * 0))
|
||||
scan_query_sample1[:, 3] = 1
|
||||
# scan_query_sample1 = np.matmul(pose_to_frame, scan_query_sample1.T).T
|
||||
scan_query_sample1 = torch.matmul(torch.from_numpy(pose_to_frame), torch.from_numpy(scan_query_sample1.T)).numpy().T
|
||||
|
||||
idx1, idx2, dis = tools.nn_match(scan_query_sample1[:, 0:2], scan_positive_sample[:, 0:2], 'euclidean')
|
||||
if len(dis) > 50:
|
||||
th1 = max([2, dis[min([256, int(len(dis) * 0.3)])]])
|
||||
idx1 = idx1[dis < th1]
|
||||
idx2 = idx2[dis < th1]
|
||||
else:
|
||||
dis = cdist(scan_query_sample1[:, 0:2], scan_positive_sample)
|
||||
min1 = np.min(dis, axis=1)
|
||||
min2 = np.min(dis, axis=0)
|
||||
min11 = np.sort(min1)
|
||||
th1 = max([0.2, min11[min([256, int(len(min11) * 0.2)])]])
|
||||
min21 = np.sort(min2)
|
||||
th2 = max([0.2, min21[min([256, int(len(min21) * 0.2)])]])
|
||||
idx1 = np.arange(len(scan_query_sample))[min1 < th1]
|
||||
idx2 = np.arange(len(scan_positive_sample))[min2 < th2]
|
||||
|
||||
# points1=scan_query_sample1[:, 0:2]
|
||||
# points2=scan_positive_sample[:, 0:2]
|
||||
# points = np.mean(np.vstack((points1, points2)), axis=0, keepdims=True)
|
||||
# points1 = points1 - points
|
||||
# points2 = points2 - points
|
||||
# af = torch.sum(torch.from_numpy(points1) ** 2, -1, keepdim=True)
|
||||
# bf = torch.sum(torch.from_numpy(points2) ** 2, -1, keepdim=True).transpose(0, 1)
|
||||
# cf = af + bf - 2 * torch.mm(torch.from_numpy(points1), torch.from_numpy(points2).transpose(0, 1)) # c^2=a^2+b^2-2abcos
|
||||
# c = torch.sqrt(cf)
|
||||
# dis = c
|
||||
# dis1 = torch.min(dis, dim=1)[0]
|
||||
# dis2 = torch.min(dis, dim=0)[0]
|
||||
# idx1 = torch.where(dis1 < 0.5)[0].numpy()
|
||||
# idx2 = torch.where(dis2 < 0.5)[0].numpy()
|
||||
# dis1=dis1.numpy()
|
||||
# dis2= dis2.numpy()
|
||||
|
||||
grid_query = grid_query[idx1]
|
||||
grid_positive = grid_positive[idx2]
|
||||
label_score[grid_query[:, 0], grid_query[:, 1], 0] = 1
|
||||
label_score[grid_positive[:, 0], grid_positive[:, 1], 1] = 1
|
||||
# fig, ax = plt.subplots(2, 2)
|
||||
# ax[0,0].imshow(bev_query[:, :, 0:3])
|
||||
# ax[0,1].imshow(label_score[:, :, 0])
|
||||
# ax[1,0].imshow(bev_positive[:, :, 0:3])
|
||||
# ax[1,1].imshow(label_score[:, :, 1])
|
||||
# plt.savefig('1.png')
|
||||
# plt.show()
|
||||
|
||||
sample = {
|
||||
'sequence': self.sequence,
|
||||
'id_query': idx_query,
|
||||
'bev_query': bev_query,
|
||||
'img_query': image_query,
|
||||
'pose_query': pose_query,
|
||||
'id_positive': idx_positive,
|
||||
'bev_positive': bev_positive,
|
||||
'img_positive': image_positive,
|
||||
'pose_positive': pose_positive,
|
||||
'pose_to_frame': pose_to_frame,
|
||||
'label_score': label_score,
|
||||
'relation_query': relation_query,
|
||||
'relation_positive': relation_positive
|
||||
}
|
||||
return sample
|
||||
|
||||
|
||||
def collate(samples):
|
||||
relation_query = []
|
||||
relation_positive = []
|
||||
samples2 = {key: default_collate([d[key] for d in samples]) for key in samples[0]
|
||||
if key != 'relation_query' and key != 'relation_positive'}
|
||||
|
||||
for single_sample in samples:
|
||||
try:
|
||||
relation_query.append(torch.from_numpy(single_sample['relation_query']))
|
||||
except:
|
||||
pass
|
||||
try:
|
||||
relation_positive.append(torch.from_numpy(single_sample['relation_positive']))
|
||||
except:
|
||||
pass
|
||||
|
||||
relation = relation_query + relation_positive
|
||||
if len(relation) > 0:
|
||||
relation1 = pad_sequence(relation, batch_first=True, padding_value=-1)
|
||||
relation1 = relation1.float()
|
||||
samples2['relation'] = relation1
|
||||
|
||||
return samples2
|
||||
|
||||
|
||||
def KittiTotalLoader(cfg):
|
||||
flag = cfg['flag']
|
||||
bev = False
|
||||
img = False
|
||||
fuse = False
|
||||
if flag == 'fusion':
|
||||
bev = True
|
||||
img = True
|
||||
fuse = True
|
||||
elif flag == 'img':
|
||||
img = True
|
||||
else:
|
||||
bev = True
|
||||
sequence_train = [int(x) for x in tools.read_cfg(cfg['train'])]
|
||||
sequence_val = [int(x) for x in tools.read_cfg(cfg['validate'])]
|
||||
sequence_test = [int(x) for x in tools.read_cfg(cfg['test'])]
|
||||
dataset_list = []
|
||||
for sequence in sequence_train:
|
||||
single_dataset = KittiDataset(cfg, sequence, flag_bev=bev, flag_img=img, flag_fuse=fuse, argument=True, mode='train')
|
||||
print('===Trainloader add: sequence %02d, %04d files, %04d frames with loop' % (sequence, len(single_dataset.poses), len(single_dataset.gt)))
|
||||
dataset_list.append(single_dataset)
|
||||
dataset_train = ConcatDataset(dataset_list)
|
||||
dataset_list = []
|
||||
for sequence in sequence_val:
|
||||
single_dataset = KittiDataset(cfg, sequence, flag_bev=bev, flag_img=img, flag_fuse=fuse, argument=False, mode='train')
|
||||
print('===Validationloader add: sequence %02d, %04d files, %04d frames with loop' % (sequence, len(single_dataset.poses), len(single_dataset.gt)))
|
||||
dataset_list.append(single_dataset)
|
||||
dataset_val = ConcatDataset(dataset_list)
|
||||
dataset_list = []
|
||||
for sequence in sequence_test:
|
||||
single_dataset = KittiDataset(cfg, sequence, flag_bev=bev, flag_img=img, flag_fuse=fuse, argument=False, mode='test')
|
||||
print('===Testloader add: sequence %02d, %04d files' % (sequence, len(single_dataset.poses)))
|
||||
dataset_list.append(single_dataset)
|
||||
dataset_test = ConcatDataset(dataset_list)
|
||||
loader_train = DataLoader(dataset_train, batch_size=cfg['batchsize'], shuffle=True, num_workers=6, collate_fn=collate)
|
||||
loader_val = DataLoader(dataset_val, batch_size=cfg['batchsize'], shuffle=True, num_workers=6, collate_fn=collate)
|
||||
loader_test = DataLoader(dataset_test, batch_size=1, shuffle=False, num_workers=6, collate_fn=collate)
|
||||
return loader_train, loader_val, loader_test
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
root_dir = '/media/ubuntu/Workshop/caodanyang/Project_CDY/results/mylcd'
|
||||
# sequence = 0
|
||||
# dataset = KittiDataset(root_dir, sequence)
|
||||
# dataloader = DataLoader(dataset, batch_size=1, num_workers=8, shuffle=False)
|
||||
# t = tools.Timer(name='Loading', number=len(dataloader))
|
||||
# for data in dataloader:
|
||||
# t.update()
|
||||
try:
|
||||
with open(os.path.join(os.getcwd(), "config.yaml"), "r") as ymlfile:
|
||||
cfg = yaml.load(ymlfile, Loader=yaml.SafeLoader)
|
||||
print('Loading config file from %s' % os.path.join(os.getcwd(), "config.yaml"))
|
||||
except:
|
||||
with open(os.path.join(os.getcwd(), "project/BevNvLcd/config.yaml"), "r") as ymlfile:
|
||||
cfg = yaml.load(ymlfile, Loader=yaml.SafeLoader)
|
||||
print('Loading config file from %s' % os.path.join(os.getcwd(), "project/BevNvLcd/config.yaml"))
|
||||
cfg = cfg['experiment']
|
||||
path_dataset = cfg['path_dataset']
|
||||
path_result = cfg['path_result']
|
||||
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
|
||||
train, val, test = KittiTotalLoader(cfg)
|
||||
t = tools.Timer(name='Loading')
|
||||
ds = []
|
||||
for i, data in enumerate(train):
|
||||
# ds.append(d)
|
||||
t.update(i)
|
||||
|
||||
# t = tools.Timer(name='Loading', number=len(val))
|
||||
# for i, data in enumerate(val):
|
||||
# t.update()
|
||||
59
env.txt
Normal file
59
env.txt
Normal file
@@ -0,0 +1,59 @@
|
||||
## 环境配置备忘
|
||||
conda create -n fusion_cyy python=3.8
|
||||
|
||||
conda activate fusion_cyy
|
||||
|
||||
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
|
||||
|
||||
pip install pykitti pytorch-metric-learning pyyaml scipy scikit-image scikit-learn tqdm open3d matplotlib numba opencv-python opencv-contrib-python pandas -i https://pypi.mirrors.ustc.edu.cn/simple/
|
||||
|
||||
conda remove -n fusion_cyy --all
|
||||
|
||||
在.git的config下
|
||||
|
||||
[user]
|
||||
name = MobKBK
|
||||
email = 202311250413@csust.edu.cn /home/adlab8/pub_data1/Kitti360/KITTI-360
|
||||
|
||||
tmux使用备忘:
|
||||
列表
|
||||
tmux ls
|
||||
|
||||
tmux new -s your-session-name wget -c https://s3.eu-central-1.amazonaws.com/avg-kitti/data_odometry_color.zip
|
||||
转后台:
|
||||
ctrl+b d
|
||||
杀死
|
||||
ctrl+d
|
||||
分离:
|
||||
tmux detach
|
||||
连接:
|
||||
tmux attach -t cyy
|
||||
切换:
|
||||
tmux switch -t your-session-name
|
||||
重命名:
|
||||
tmux rename-session -t old-session new-session
|
||||
新建窗口:
|
||||
tmux new-window -n your-window-name
|
||||
选择窗口:
|
||||
ctrl+b c: 创建一个新窗口(状态栏会显示多个窗口的信息)
|
||||
ctrl+b p: 切换到上一个窗口(按照状态栏的顺序)
|
||||
ctrl+b n: 切换到下一个窗口
|
||||
ctrl+b w: 从列表中选择窗口
|
||||
重命名窗口:
|
||||
tmux rename-window -t old_name new_name
|
||||
# 划分为上下两个窗格
|
||||
tmux split-window
|
||||
|
||||
# 划分左右两个窗格
|
||||
tmux split-window -h
|
||||
左右划分:ctrl+b %
|
||||
上下划分:ctrl+b "
|
||||
|
||||
ctrl+b [ 使用PageUp和PageDown可以实现上下翻页
|
||||
|
||||
kitti位置
|
||||
/home/adlab8/pub_data1/OpenDataLab___KITTI_Odometry_2012/raw
|
||||
kitt360
|
||||
|
||||
/home/adlab8/pub_data1/KITTI-360/
|
||||
ln -s <源文件或目录的路径> <软连接的路径>
|
||||
81
evaluate_fm.py
Normal file
81
evaluate_fm.py
Normal file
@@ -0,0 +1,81 @@
|
||||
import os
|
||||
import time
|
||||
import torch
|
||||
import tools
|
||||
import numpy as np
|
||||
import yaml
|
||||
from dataset import KittiTotalLoader
|
||||
from tqdm import tqdm
|
||||
|
||||
def match_err(fea1,fea2,pose1,pose2,kpt1,kpt2):
|
||||
pose_to_frame=torch.matmul(pose2.inverse(),pose1)
|
||||
kpt1 = (pose_to_frame @ kpt1.permute(1, 0)).permute(1, 0)
|
||||
kpt1[:,2]=0
|
||||
kpt2[:,2]=0
|
||||
idxp1,idxp2,dis_kpt=tools.nn_match(kpt1,kpt2,'euclidean')
|
||||
idxf1,idxf2,dis_fea=tools.nn_match(fea1,fea2,'cosine')
|
||||
kpt11=kpt1[idxf1]
|
||||
kpt21=kpt2[idxf2]
|
||||
dis_kpt1=(kpt11-kpt21).norm(p=2,dim=1)
|
||||
mas=[]
|
||||
mss=[]
|
||||
reps=[]
|
||||
for thr in [0.3,0.5,1,2,3]:
|
||||
mas.append((torch.sum(dis_kpt1<=thr)/(len(idxf1)+1e-8)).item())
|
||||
mss.append((torch.sum(dis_kpt1<=thr)/(torch.sum(dis_kpt<=thr)+1e-8)).item())
|
||||
reps.append((torch.sum(dis_kpt<=thr)).item()/(len(fea1)+len(fea2))*2)
|
||||
return mas,mss,reps
|
||||
|
||||
def feature_match(loader_val=None,data=None):
|
||||
if loader_val is None:
|
||||
try:
|
||||
with open(os.path.join(os.getcwd(), "config.yaml"), "r") as ymlfile:
|
||||
cfg = yaml.load(ymlfile, Loader=yaml.SafeLoader)
|
||||
print('Loading config file from %s' % os.path.join(os.getcwd(), "config.yaml"))
|
||||
except:
|
||||
with open(os.path.join(os.getcwd(), "project/BevNvLcd/config.yaml"), "r") as ymlfile:
|
||||
cfg = yaml.load(ymlfile, Loader=yaml.SafeLoader)
|
||||
print('Loading config file from %s' % os.path.join(os.getcwd(), "project/BevNvLcd/config.yaml"))
|
||||
cfg = cfg['experiment']
|
||||
_, loader_val, _ = KittiTotalLoader(cfg)
|
||||
sequences=data['sequences']
|
||||
fea_kpt=data['fea_kpt'].cuda()
|
||||
poses=data['pose_query'].cuda()
|
||||
kpts=data['key_points'].cuda()
|
||||
seq=torch.unique(sequences)
|
||||
if __name__ == '__main__':
|
||||
flag = False
|
||||
else:
|
||||
flag = True
|
||||
for i in range(len(seq)):
|
||||
idx=sequences==seq[i]
|
||||
fea_kpt1=fea_kpt[idx]
|
||||
poses1=poses[idx]
|
||||
kpts1=kpts[idx]
|
||||
gt=loader_val.dataset.datasets[i].gt
|
||||
ms=[]
|
||||
for j in tqdm(range(0, len(gt)), disable=flag, ncols=60, desc='feature match'):
|
||||
query=gt[j]['idx']
|
||||
p_idxs=gt[j]['positive_idxs']
|
||||
for p in p_idxs:
|
||||
a,s,reps=match_err(fea_kpt1[query],fea_kpt1[p],poses1[query],poses1[p],kpts1[query],kpts1[p])
|
||||
ms.append(a+s+reps)
|
||||
ms=np.asarray(ms)
|
||||
ms1=np.mean(ms,axis=0)
|
||||
print('Feature matching, sequence %02d'%(seq[i]))
|
||||
print('MA@0.3:%.3f, MA@0.5:%.3f, MA@1:%.3f, MA@2:%.3f, MA@3:%.3f' %(ms1[0], ms1[1], ms1[2], ms1[3], ms1[4]))
|
||||
print('MS@0.3:%.3f, MS@0.5:%.3f, MS@1:%.3f, MS@2:%.3f, MS@3:%.3f' %(ms1[5], ms1[6], ms1[7], ms1[8], ms1[9]))
|
||||
print('RP@0.3:%.3f, RP@0.5:%.3f, RP@1:%.3f, RP@2:%.3f, RP@3:%.3f' %(ms1[10], ms1[11], ms1[12], ms1[13], ms1[14]))
|
||||
time.sleep(1)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
# data = torch.load("/mnt/data2/datasets/cdy/results/FUSIONLCD/05230/database/database_149_b0.pth.tar")
|
||||
# data = torch.load("/mnt/data2/datasets/cdy/results/BEVLCD/ricnn03202/database/database_xyz_000.pth.tar")
|
||||
data= torch.load("/data4/caodanyang/results/FUSIONLCD/bev_07190/database/database_all.pth.tar")
|
||||
feature_match(data=data)
|
||||
|
||||
265
evaluate_lcd.py
Normal file
265
evaluate_lcd.py
Normal file
@@ -0,0 +1,265 @@
|
||||
import os
|
||||
import time
|
||||
import matplotlib
|
||||
# set non-interactive backend for server (must be set before pyplot import)
|
||||
matplotlib.use('Agg')
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
import torch
|
||||
from skimage.measure import ransac
|
||||
from skimage.transform import EuclideanTransform
|
||||
import tools
|
||||
from tqdm import tqdm
|
||||
from sklearn.metrics import auc
|
||||
from sklearn.neighbors import KDTree
|
||||
import warnings
|
||||
|
||||
warnings.filterwarnings("ignore")
|
||||
|
||||
|
||||
def recall_with_candidates(vlads, poses, sequence, recall_num=25, positive_distance=4):
|
||||
recall_at_k = [0] * recall_num
|
||||
num_with_loop = 0
|
||||
if __name__ == '__main__':
|
||||
flag = False
|
||||
else:
|
||||
flag = True
|
||||
for i in tqdm(range(0, len(vlads)), disable=flag, ncols=60, desc='Recall@k'):
|
||||
valid_idx = list(set(range(0, len(vlads))) - set(range(max(0, i - 50), min(len(vlads), i + 50))))
|
||||
valid_idx = torch.tensor(valid_idx).to(vlads.device)
|
||||
vlad_query = vlads[i].view(1, -1)
|
||||
vlad_valid = vlads[valid_idx]
|
||||
dis_valid = torch.linalg.norm((poses[i:i + 1, 0:3, 3] - poses[valid_idx, 0:3, 3]), dim=1)
|
||||
min_dis = torch.min(dis_valid)
|
||||
if min_dis > positive_distance:
|
||||
continue
|
||||
num_with_loop = num_with_loop + 1
|
||||
# global feature to query quickly
|
||||
dis_vlad = torch.cdist(vlad_query, vlad_valid).view(-1, )
|
||||
dis, idx_cand = torch.topk(dis_vlad, recall_num, largest=False)
|
||||
idx_cand = valid_idx[idx_cand]
|
||||
for j in range(recall_num):
|
||||
idx_cand1 = idx_cand[j]
|
||||
dis = torch.linalg.norm((poses[i:i + 1, 0:3, 3] - poses[idx_cand1, 0:3, 3]), dim=1)
|
||||
if dis <= positive_distance:
|
||||
recall_at_k[j] = recall_at_k[j] + 1
|
||||
break
|
||||
time.sleep(1)
|
||||
recall_at_k = np.cumsum(recall_at_k) / float(num_with_loop)
|
||||
print('Sequence %02d, Recall@' % sequence, end='')
|
||||
for i in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 14, 19, 24]:
|
||||
if i == (len(recall_at_k) - 1):
|
||||
print('%d[%.3f]' % (i + 1, recall_at_k[i]))
|
||||
else:
|
||||
print('%d[%.3f]' % (i + 1, recall_at_k[i]), end=', ')
|
||||
return recall_at_k
|
||||
|
||||
|
||||
def retrieve(vlads, feas, kpts, poses, num_cand=1, verify='ransac'):
|
||||
loops = []
|
||||
if __name__ == '__main__':
|
||||
flag = False
|
||||
else:
|
||||
flag = True
|
||||
ts = []
|
||||
for i in tqdm(range(0, len(feas)), disable=flag, ncols=60, desc='retrieve loop'):
|
||||
t0 = time.time()
|
||||
valid_idx = list(set(range(0, len(feas))) - set(range(max(0, i - 50), min(len(feas), i + 50))))
|
||||
valid_idx = torch.tensor(valid_idx).to(vlads.device)
|
||||
vlad_query = vlads[i].view(1, -1)
|
||||
vlad_valid = vlads[valid_idx]
|
||||
# global feature to query quickly
|
||||
dis_vlad = torch.cdist(vlad_query, vlad_valid).view(-1, )
|
||||
dis, idx_cand = torch.topk(dis_vlad, num_cand, largest=False)
|
||||
t_retrieve = time.time() - t0
|
||||
idx_cand = valid_idx[idx_cand]
|
||||
# local feature to qverify
|
||||
fea_query = feas[i]
|
||||
if verify == 'ransac':
|
||||
p1 = kpts[i]
|
||||
p1 = p1.cpu().detach().numpy()
|
||||
min_dis = torch.tensor([9999])
|
||||
idx_detect = idx_cand[0]
|
||||
dis_truth = torch.tensor([9999])
|
||||
for idx_cand1 in idx_cand:
|
||||
fea_cand1 = feas[idx_cand1]
|
||||
p2 = kpts[idx_cand1].cpu().detach().numpy()
|
||||
idx1, idx2, dis = tools.nn_match(fea_query, fea_cand1, 'cosine')
|
||||
if len(idx1) < 31:
|
||||
continue
|
||||
idx1 = idx1.cpu().detach().numpy()
|
||||
idx2 = idx2.cpu().detach().numpy()
|
||||
try:
|
||||
model, inliers = ransac((p1[idx1, 0:2], p2[idx2, 0:2]), model_class=EuclideanTransform, min_samples=15, max_trials=3, residual_threshold=1)
|
||||
num_inlier = np.sum(inliers)
|
||||
# r = model.params[0:2, 0:2]
|
||||
dis_estimate = np.linalg.norm(model.params[0:2, 2])
|
||||
# rot = model.rotation
|
||||
if num_inlier > 30: # ransac存在足够内点
|
||||
if min_dis > dis_estimate:
|
||||
min_dis = dis_estimate
|
||||
idx_detect = idx_cand1
|
||||
dis_truth = torch.linalg.norm((poses[i, 0:3, 3] - poses[idx_detect, 0:3, 3]))
|
||||
except:
|
||||
pass
|
||||
loops.append([i, idx_detect.item(), min_dis.item(), dis_truth.item()])
|
||||
|
||||
else:
|
||||
idx_detect = idx_cand[0]
|
||||
dis_truth = torch.linalg.norm(poses[i, 0:3, 3] - poses[idx_detect, 0:3, 3])
|
||||
loops.append([i, idx_detect.item(), dis[0].item(), dis_truth.item()])
|
||||
t_verify = time.time() - t0 - t_retrieve
|
||||
|
||||
ts.append([t_retrieve, t_verify])
|
||||
# if loops[-1][2] < 4 and loops[-1][1] < i:
|
||||
# loop1.append(loops[-1][1])
|
||||
# x = poses[:, 0, 3]
|
||||
# y = poses[:, 1, 3]
|
||||
# x1 = x[loop1]
|
||||
# y1 = y[loop1]
|
||||
# plt.plot(x, y, 'b.', markersize=1)
|
||||
# plt.plot(x1, y1, 'ro', markersize=2, markerfacecolor='none')
|
||||
# plt.axis('equal')
|
||||
# plt.show()
|
||||
ts = np.array(ts) * 1000
|
||||
# np.savetxt('times.txt', ts)
|
||||
# x=np.arange(len(ts))
|
||||
# plt.plot(x,ts[:,0],'b.')
|
||||
# plt.plot(x,ts[:,1],'r.')
|
||||
# plt.show()
|
||||
loops = np.array(loops)
|
||||
return loops
|
||||
|
||||
|
||||
def pr_curve(poses, loops, sequence, positive_distance=4):
|
||||
|
||||
map_tree_poses = KDTree(poses[:, 0:3, 3])
|
||||
reverse_loops = []
|
||||
real_loop = []
|
||||
for i in range(0,len(poses)):
|
||||
min_range = max(0, i - 50)
|
||||
max_range = min(i + 50, poses.shape[0])
|
||||
current_pose = poses[i]
|
||||
indices = map_tree_poses.query_radius(np.expand_dims(current_pose[0:3, 3], 0), positive_distance)
|
||||
valid_idxs = list(set(indices[0]) - set(range(min_range, max_range)))
|
||||
valid_idxs = np.array(valid_idxs)
|
||||
if len(valid_idxs) > 0:
|
||||
# dis = np.linalg.norm(current_pose[0:3, 3]-poses[valid_idxs,0:3,3],axis=1)
|
||||
real_loop.append(1)
|
||||
r0 = poses[i, :3, :3]
|
||||
rs = poses[valid_idxs, :3, :3]
|
||||
dr = np.linalg.inv(r0) @ rs.swapaxes(0, 2)
|
||||
angle = np.arccos(np.clip((np.trace(dr) - 1) / 2, -1, 1))
|
||||
angle = angle * 180 / np.pi
|
||||
if np.min(angle) > 90:
|
||||
reverse_loops.append(1)
|
||||
else:
|
||||
reverse_loops.append(0)
|
||||
else:
|
||||
real_loop.append(0)
|
||||
reverse_loops.append(0)
|
||||
reverse_loops = np.array(reverse_loops)
|
||||
real_loop = np.array(real_loop)
|
||||
# loops=np.hstack((loops,real_loop.reshape(-1,1)))
|
||||
# np.savetxt('loops_bev%02d.txt'%sequence,loops,fmt='%.6f')
|
||||
# print('sequence %d, %d frames, %d loops, %d reverse loops' % (sequence,len(real_loop), np.sum(real_loop), np.sum(reverse_loops)))
|
||||
# # return 0
|
||||
distances = loops[:, 3]
|
||||
detected_loop = loops[:, 2]
|
||||
precision2 = [1]
|
||||
recall2 = [0]
|
||||
for thr in np.unique(detected_loop):
|
||||
tp = detected_loop <= thr
|
||||
tp = tp & real_loop
|
||||
tp = tp & (distances <= positive_distance)
|
||||
tp = tp.sum()
|
||||
fp = (detected_loop <= thr).sum() - tp
|
||||
fn = (real_loop.sum()) - tp
|
||||
if (tp + fp) > 0.:
|
||||
precision2.append(tp / (tp + fp))
|
||||
else:
|
||||
precision2.append(1.)
|
||||
|
||||
recall2.append(tp / (tp + fn))
|
||||
f1s = []
|
||||
for i in range(len(recall2)):
|
||||
f1s.append((2 * precision2[i] * recall2[i]) / (precision2[i] + recall2[i]))
|
||||
f1 = max(f1s)
|
||||
recall_p1 = np.max(np.array(recall2)[np.array(precision2) == 1])
|
||||
# plt.plot(recall2, precision2, 'b-')
|
||||
# plt.show()
|
||||
pr = np.array(precision2 + recall2).reshape(2, -1).T
|
||||
# np.save('fusion_pr_%02d.npy' % sequence, pr)
|
||||
ap = auc(recall2, precision2)
|
||||
idx=loops[:,2]<9999
|
||||
loops1=loops[idx]
|
||||
rp=np.sum(np.abs(loops1[:,2]-loops1[:,3])<2)/len(loops1)
|
||||
|
||||
print('Sequence %02d, AP %.3f, Recall@100 %.3f, F1 %.3f, RP %.3f/%d' % (sequence, ap, recall_p1, f1, rp, len(loops1)))
|
||||
# if ap<0.1:
|
||||
# exit()
|
||||
|
||||
# --- save PR curve to file (for server usage) ---
|
||||
try:
|
||||
out_dir = os.path.join('/home/adlab36/chenyouyuan/FUSIONLCD', 'result', 'plots')
|
||||
os.makedirs(out_dir, exist_ok=True)
|
||||
plt.figure(figsize=(6, 5))
|
||||
plt.plot(recall2, precision2, 'b-', marker='o', linewidth=2)
|
||||
plt.xlabel('Recall')
|
||||
plt.ylabel('Precision')
|
||||
plt.title(f'Sequence {int(sequence):02d} PR (AP={ap:.3f})')
|
||||
plt.grid(True, linestyle='--', alpha=0.4)
|
||||
fname = os.path.join(out_dir, f'pr_sequence_{int(sequence):02d}.png')
|
||||
plt.tight_layout()
|
||||
plt.savefig(fname, dpi=150)
|
||||
plt.close()
|
||||
except Exception as e:
|
||||
# 如果保存失败也不要阻塞主流程
|
||||
print(f'Warning: failed to save PR plot for sequence {sequence}: {e}')
|
||||
|
||||
return ap, recall_p1, f1
|
||||
|
||||
|
||||
def lcd(data):
|
||||
vlads = data['vlads'].cuda()
|
||||
kpts = data['key_points']
|
||||
sequences = data['sequences']
|
||||
poses = data['pose_query'].cuda()
|
||||
feas = data['fea_kpt'].cuda()
|
||||
# feas = feas / torch.sqrt(torch.sum(feas ** 2, -1, keepdim=True) + 1e-8)
|
||||
result = []
|
||||
recall_at_ks = []
|
||||
recall_at_k=[]
|
||||
for s in torch.unique(sequences):
|
||||
# if s==54:
|
||||
# continue
|
||||
mask = sequences == s
|
||||
|
||||
vlads1 = vlads[mask]
|
||||
feas1 = feas[mask]
|
||||
kpts1 = kpts[mask]
|
||||
poses1 = poses[mask]
|
||||
poses2 = poses1.cpu().detach().numpy()
|
||||
# recall_at_k = recall_with_candidates(vlads1, poses1, s)
|
||||
# idx=np.arange(len(vlads1)//2)
|
||||
# idx=np.tile(idx, 2)
|
||||
# vlads1, feas1, kpts1, poses1 =vlads1[idx], feas1[idx], kpts1[idx], poses1[idx]
|
||||
loops = retrieve(vlads1, feas1, kpts1, poses1, 1, 'ransac')
|
||||
ap, recall_p1, f1 = pr_curve(poses2, loops, s, 4)
|
||||
recall_at_ks.append(recall_at_k)
|
||||
result.append([ap, recall_p1, f1])
|
||||
|
||||
return result, recall_at_ks
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
np.random.seed(123)
|
||||
# data = torch.load('/data4/caodanyang/results/FUSIONLCD/07030/database/database_bev.pth.tar')
|
||||
# lcd(data)
|
||||
print('----------------------------------------------------------------------')
|
||||
data= torch.load('/home/adlab36/chenyouyuan/FUSIONLCD/result/log/database/database_bevp.pth.tar')
|
||||
lcd(data)
|
||||
print('----------------------------------------------------------------------')
|
||||
# data=torch.load('/data4/caodanyang/results/FUSIONLCD/07030/database/database_fusion.pth.tar')
|
||||
# lcd(data)
|
||||
|
||||
191
evaluate_pose.py
Normal file
191
evaluate_pose.py
Normal file
@@ -0,0 +1,191 @@
|
||||
import math
|
||||
import os
|
||||
import time
|
||||
import warnings
|
||||
|
||||
import numpy as np
|
||||
import torch
|
||||
import yaml
|
||||
from skimage.measure import ransac
|
||||
from skimage.transform import EuclideanTransform
|
||||
from tqdm import tqdm
|
||||
|
||||
import net
|
||||
import tools
|
||||
from dataset import KittiTotalLoader
|
||||
|
||||
warnings.filterwarnings("ignore")
|
||||
|
||||
|
||||
def npto_XYZRPY(rotmatrix):
|
||||
'''
|
||||
Usa mathutils per trasformare una matrice di trasformazione omogenea in xyzrpy
|
||||
https://docs.blender.org/api/master/mathutils.html#
|
||||
WARNING: funziona in 32bits quando le variabili numpy sono a 64 bit
|
||||
|
||||
:param rotmatrix: np array
|
||||
:return: np array with the xyzrpy
|
||||
'''
|
||||
|
||||
# qui sotto corrisponde a
|
||||
# quat2eul([ 0.997785 -0.0381564 0.0358964 0.041007 ],'XYZ')
|
||||
roll = math.atan2(-rotmatrix[1, 2], rotmatrix[2, 2])
|
||||
pitch = math.asin(rotmatrix[0, 2])
|
||||
yaw = math.atan2(-rotmatrix[0, 1], rotmatrix[0, 0])
|
||||
x = rotmatrix[:3, 3][0]
|
||||
y = rotmatrix[:3, 3][1]
|
||||
z = rotmatrix[:3, 3][2]
|
||||
|
||||
return np.array([x, y, z, roll, pitch, yaw])
|
||||
|
||||
|
||||
def yt_error(pose1, pose2):
|
||||
distance = np.linalg.norm((pose1 - pose2)[0:3, 3])
|
||||
yaw1 = npto_XYZRPY(pose1)[-1]
|
||||
yaw2 = npto_XYZRPY(pose2)[-1]
|
||||
yaw1 = yaw1 % (2 * np.pi)
|
||||
yaw2 = yaw2 % (2 * np.pi)
|
||||
dyaw = abs(yaw1 - yaw2) % (2 * np.pi)
|
||||
angle = dyaw * 180 / np.pi
|
||||
if angle > 180:
|
||||
angle = 360 - angle
|
||||
return distance, angle
|
||||
|
||||
|
||||
def rt_error(pose1, pose2):
|
||||
r0 = pose1[:3, :3]
|
||||
r1 = pose2[:3, :3]
|
||||
dr = np.linalg.inv(r0) @ r1
|
||||
angle = np.arccos(np.clip((np.trace(dr) - 1) / 2, -1, 1)) * 180 / np.pi
|
||||
distance = np.sqrt(np.sum(np.square(pose1[:3, -1] - pose2[:3, -1])))
|
||||
return distance, angle
|
||||
|
||||
|
||||
def pose_err(vlads=None, dataset=None, positive_distance=4., kpts=None, feas=None, model=None, num_cand=10):
|
||||
poses = dataset.poses
|
||||
error_ransac = []
|
||||
error_uot = []
|
||||
gt = dataset.gt
|
||||
pairs = []
|
||||
num_reverse = 0
|
||||
for i in range(len(gt)):
|
||||
sample = gt[i]
|
||||
idx_query = sample['idx']
|
||||
positive_idxs = sample['positive_idxs']
|
||||
for j in range(len(positive_idxs)):
|
||||
idx_positive = positive_idxs[j]
|
||||
if idx_query < idx_positive:
|
||||
pairs.append([idx_query, idx_positive])
|
||||
_, r = yt_error(poses[idx_query], poses[idx_positive])
|
||||
if r > 90:
|
||||
num_reverse = num_reverse + 1
|
||||
# print('Total %d frames, %d pair of loops, %d[%.3f] reverse'%(len(poses),len(pairs),num_reverse,num_reverse/len(pairs)))
|
||||
cnt_ransca = 0
|
||||
pairs = np.asarray(pairs)
|
||||
idx = np.argsort(pairs[:, 1])
|
||||
pairs = pairs[idx]
|
||||
times_uot=[]
|
||||
times_ransac=[]
|
||||
for i in tqdm(range(len(pairs)),ncols=60):
|
||||
fea_query = feas[pairs[i][0]]
|
||||
p1 = kpts[pairs[i][0]].cpu().detach().numpy()
|
||||
fea_cand = feas[pairs[i][1]]
|
||||
p2 = kpts[pairs[i][1]].cpu().detach().numpy()
|
||||
pose_to_frame = np.matmul(np.linalg.inv(poses[pairs[i][1]]), poses[pairs[i][0]])
|
||||
|
||||
st_ransac=time.time()
|
||||
idx1, idx2, dis = tools.nn_match(fea_query, fea_cand, 'cosine')
|
||||
ransac_flag=False
|
||||
if len(idx1) >= 20:
|
||||
idx1 = idx1.cpu().detach().numpy()
|
||||
idx2 = idx2.cpu().detach().numpy()
|
||||
try:
|
||||
result_ransac, inliers = ransac((p1[idx1, 0:3], p2[idx2, 0:3]), model_class=EuclideanTransform, min_samples=15, max_trials=3, residual_threshold=1.7)
|
||||
num_inlier = np.sum(inliers)
|
||||
if num_inlier > 30: # ransac存在一致的点且无缩
|
||||
cnt_ransca = cnt_ransca + 1
|
||||
error_ransac.append(yt_error(pose_to_frame, result_ransac.params))
|
||||
ransac_flag=True
|
||||
except:
|
||||
pass
|
||||
# if not ransac_flag:
|
||||
# try:
|
||||
# result_ransac, inliers = ransac((p1[idx1, 0:3], p2[idx2, 0:3]), model_class=EuclideanTransform,min_samples=5,max_trials=10,residual_threshold=5)
|
||||
# num_inlier = np.sum(inliers)
|
||||
# cnt_ransca = cnt_ransca + 1
|
||||
# error_ransac.append(yt_error(pose_to_frame, result_ransac.params))
|
||||
# except:
|
||||
# pass
|
||||
times_ransac.append(time.time()-st_ransac)
|
||||
|
||||
fea1 = feas[pairs[i], :, :].to(model.epsilon.device).permute(0, 2, 1)
|
||||
kpts1 = kpts[pairs[i], :, :].to(model.epsilon.device)
|
||||
bd = {'fea_kpt': fea1, 'key_points': kpts1}
|
||||
st_uot=time.time()
|
||||
bd = model(bd)
|
||||
times_uot.append(time.time()-st_uot)
|
||||
pose_estimate1 = bd['transformation'].squeeze(0).cpu().detach().numpy()
|
||||
pose_estimate1 = np.vstack((pose_estimate1, [0, 0, 0, 1]))
|
||||
error_uot.append(yt_error(pose_to_frame, pose_estimate1))
|
||||
|
||||
ransac_rate = cnt_ransca / len(pairs)
|
||||
error_ransac = np.asarray(error_ransac)
|
||||
error_uot = np.asarray(error_uot)
|
||||
# np.save('error_ransac_%02d.npy' % dataset.sequence, error_ransac)
|
||||
# np.save('error_uot_%02d.npy' % dataset.sequence, error_uot)
|
||||
et_ransac = np.mean(error_ransac[:, 0])
|
||||
er_ransac = np.mean(error_ransac[:, 1])
|
||||
et_uot = np.mean(error_uot[:, 0])
|
||||
er_uot = np.mean(error_uot[:, 1])
|
||||
print('uot time: ',np.array(times_uot).mean())
|
||||
print('ransac time: ',np.array(times_ransac).mean())
|
||||
return et_ransac, er_ransac, et_uot, er_uot, ransac_rate
|
||||
|
||||
|
||||
def estimate_pose(database):
|
||||
try:
|
||||
with open(os.path.join(os.getcwd(), "config.yaml"), "r") as ymlfile:
|
||||
cfg = yaml.load(ymlfile, Loader=yaml.SafeLoader)
|
||||
print('Loading config file from %s' % os.path.join(os.getcwd(), "config.yaml"))
|
||||
except:
|
||||
with open(os.path.join(os.getcwd(), "project/BevNvLcd/config.yaml"), "r") as ymlfile:
|
||||
cfg = yaml.load(ymlfile, Loader=yaml.SafeLoader)
|
||||
print('Loading config file from %s' % os.path.join(os.getcwd(), "project/BevNvLcd/config.yaml"))
|
||||
cfg = cfg['experiment']
|
||||
path_result = cfg['path_result']
|
||||
_, _, loader_test = KittiTotalLoader(cfg)
|
||||
model = net.Fusion(cfg)
|
||||
|
||||
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
|
||||
model = model.to(device)
|
||||
checkpoint = torch.load(tools.path_join(path_result, 'models', cfg['last_model']))
|
||||
model.load_state_dict(checkpoint['model'])
|
||||
uot = model.uot
|
||||
# uot = model.bev.uot
|
||||
|
||||
vlads = database['vlads']
|
||||
key_points = database['key_points']
|
||||
fea_kpt = database['fea_kpt']
|
||||
sequences = database['sequences']
|
||||
print()
|
||||
print('****************************************************************************')
|
||||
end = 0
|
||||
for i in range(len(loader_test.dataset.datasets)):
|
||||
start = end
|
||||
end = start + len(loader_test.dataset.datasets[i])
|
||||
# end=0
|
||||
# start=0
|
||||
et_ransac, er_ransac, et_uot, er_uot, rate = pose_err(vlads=vlads[start:end], dataset=loader_test.dataset.datasets[i],
|
||||
kpts=key_points[start:end], feas=fea_kpt[start:end], model=uot)
|
||||
print('Sequence %02d' % (torch.unique(sequences)[i]))
|
||||
print('ransac rate:%.4f, translation error:%.4f[m], rototion error:%.4f[deg]' % (rate, float(et_ransac), float(er_ransac)))
|
||||
print('uot translation error:%.4f[m], uot rototion error:%.4f[deg],' % (float(et_uot), float(er_uot)))
|
||||
print('****************************************************************************')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
# CUDA_VISIBLE_DEVICES=5 python evaluate_pose.py
|
||||
# CUDA_VISIBLE_DEVICES=2 nohup python -u evaluate_pose.py >03090.log 2>&1 &
|
||||
# fuser /dev/nvidia*
|
||||
database = torch.load('/data4/caodanyang/results/FUSIONLCD/07030/database/database_fusion.pth.tar')
|
||||
estimate_pose(database)
|
||||
130
loop_gt.py
Normal file
130
loop_gt.py
Normal file
@@ -0,0 +1,130 @@
|
||||
import argparse
|
||||
import torch
|
||||
from torch.utils.data import Dataset
|
||||
import pykitti
|
||||
import os
|
||||
from sklearn.neighbors import KDTree
|
||||
import pickle
|
||||
import numpy as np
|
||||
skip_frame=50
|
||||
|
||||
class KITTILoader3DPosesOnlyLoopPositives(Dataset):
|
||||
|
||||
def __init__(self, dir, sequence, poses, positive_range=5., negative_range=25., hard_range=None):
|
||||
super(KITTILoader3DPosesOnlyLoopPositives, self).__init__()
|
||||
|
||||
self.positive_range = positive_range
|
||||
self.negative_range = negative_range
|
||||
self.hard_range = hard_range
|
||||
self.dir = dir
|
||||
self.sequence = sequence
|
||||
|
||||
if int(sequence) > 21:
|
||||
self.poses = np.load(poses)
|
||||
else:
|
||||
calib = np.genfromtxt(os.path.join(dir, 'sequences', sequence, 'calib.txt'))[:, 1:]
|
||||
T_cam_velo = np.reshape(calib[4], (3, 4))
|
||||
T_cam_velo = np.vstack([T_cam_velo, [0, 0, 0, 1]])
|
||||
poses2 = []
|
||||
with open(poses, 'r') as f:
|
||||
for x in f:
|
||||
x = x.strip().split()
|
||||
x = [float(v) for v in x]
|
||||
pose = np.zeros((4, 4))
|
||||
pose[0, 0:4] = np.array(x[0:4])
|
||||
pose[1, 0:4] = np.array(x[4:8])
|
||||
pose[2, 0:4] = np.array(x[8:12])
|
||||
pose[3, 3] = 1.0
|
||||
pose = np.linalg.inv(T_cam_velo) @ (pose @ T_cam_velo)
|
||||
poses2.append(pose)
|
||||
self.poses = np.stack(poses2)
|
||||
self.kdtree = KDTree(self.poses[:, :3, 3])
|
||||
|
||||
def __len__(self):
|
||||
return len(self.poses)
|
||||
|
||||
def __getitem__(self, idx):
|
||||
|
||||
x = self.poses[idx, 0, 3]
|
||||
y = self.poses[idx, 1, 3]
|
||||
z = self.poses[idx, 2, 3]
|
||||
r0 = self.poses[idx, :3, :3]
|
||||
rs = self.poses[:, :3, :3]
|
||||
dr = np.linalg.inv(r0) @ rs.swapaxes(0, 2)
|
||||
angle = np.arccos(np.clip((np.trace(dr) - 1) / 2, -1, 1))
|
||||
angle = angle * 180 / np.pi
|
||||
|
||||
idx_angle = np.where(angle < 99999)[0]
|
||||
anchor_pose = torch.tensor([x, y, z])
|
||||
|
||||
indices = self.kdtree.query_radius(anchor_pose.unsqueeze(0).numpy(), self.positive_range, sort_results=True, return_distance=True)
|
||||
indices = [indices[0][0], indices[1][0]]
|
||||
min_range = max(0, idx - skip_frame)
|
||||
max_range = min(idx + skip_frame, len(self.poses))
|
||||
positive_idxs = list(set(indices[0]) & set(idx_angle) - set(range(min_range, max_range)))
|
||||
loop_angle = angle[positive_idxs]
|
||||
reverse = 0
|
||||
if len(loop_angle) > 0:
|
||||
reverse=np.sum(loop_angle>90)
|
||||
if min(loop_angle) > 90:
|
||||
reverse = -1*reverse
|
||||
positive_idxs.sort()
|
||||
num_loop = len(positive_idxs)
|
||||
|
||||
indices = self.kdtree.query_radius(anchor_pose.unsqueeze(0).numpy(), self.negative_range)
|
||||
indices = set(indices[0])
|
||||
negative_idxs = set(range(len(self.poses))) - indices
|
||||
negative_idxs = list(negative_idxs)
|
||||
negative_idxs.sort()
|
||||
|
||||
hard_idxs = None
|
||||
if self.hard_range is not None:
|
||||
inner_indices = self.kdtree.query_radius(anchor_pose.unsqueeze(0).numpy(), self.hard_range[0])
|
||||
outer_indices = self.kdtree.query_radius(anchor_pose.unsqueeze(0).numpy(), self.hard_range[1])
|
||||
hard_idxs = set(outer_indices[0]) - set(inner_indices[0])
|
||||
pass
|
||||
|
||||
return num_loop, positive_idxs, negative_idxs, hard_idxs, reverse
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument('--root_folder', default='/home/adlab36/chenyouyuan/FUSIONLCD',
|
||||
help='dataset directory')
|
||||
args = parser.parse_args()
|
||||
base_dir = args.root_folder
|
||||
for sequence in ['00', '05', '06', '07', '08', '09', '50', '54', '55', '56', '59']:#'00', '05', '06', '07', '08', '09', '50', '54', '55', '56', '59', '120205','130405'
|
||||
if int(sequence) < 50:
|
||||
poses_file = base_dir + "/sequences/" + sequence + "/poses.txt"
|
||||
elif int(sequence)<100:
|
||||
poses_file = base_dir + "/sequences/" + sequence + "/poses.npy"
|
||||
else:
|
||||
pass
|
||||
|
||||
|
||||
dataset = KITTILoader3DPosesOnlyLoopPositives(base_dir, sequence, poses_file, 4, 15, [8, 15])
|
||||
lc_gt = []
|
||||
lc_gt_file = os.path.join(base_dir, 'sequences', sequence, 'loop_GT_4m.pickle')
|
||||
loop_pairs = []
|
||||
loop_files = []
|
||||
for i in range(len(dataset)):
|
||||
sample, pos, neg, hard, reverse = dataset[i]
|
||||
if sample > 0.:
|
||||
loop_files.append([i, reverse])
|
||||
sample_dict = {}
|
||||
sample_dict['idx'] = i
|
||||
sample_dict['positive_idxs'] = pos
|
||||
for p in pos:
|
||||
if i < p:
|
||||
loop_pairs.append([i, p])
|
||||
# sample_dict['negative_idxs'] = neg
|
||||
# sample_dict['hard_idxs'] = hard
|
||||
lc_gt.append(sample_dict)
|
||||
loop_files = np.array(loop_files)
|
||||
num_reverse_file = int(np.sum(loop_files[:, 1]<0))
|
||||
num_reverse_pairs = int(np.sum(np.abs(loop_files[:, 1])))/2
|
||||
with open(lc_gt_file, 'wb') as f:
|
||||
pickle.dump(lc_gt, f)
|
||||
print('Sequence %02d done,%05d files, %05d files with loop, %05d[%.4f] files only has reverse loop, %05d loop pairs, %05d[%.4f] reverse loop' %
|
||||
(int(sequence), len(dataset), len(loop_files), num_reverse_file, num_reverse_file / len(loop_files), len(loop_pairs),num_reverse_pairs,num_reverse_pairs/len(loop_pairs)))
|
||||
340
loss.py
Normal file
340
loss.py
Normal file
@@ -0,0 +1,340 @@
|
||||
import numpy as np
|
||||
import torch
|
||||
import torch.nn as nn
|
||||
import torch.nn.functional as F
|
||||
from pytorch_metric_learning import distances
|
||||
|
||||
|
||||
def tr_loss(batch_dict,key):
|
||||
loss1 = (batch_dict[key][:, 0:3, 3] -
|
||||
batch_dict['pose_to_frame'][:, 0:3, 3]).norm(dim=1).mean()
|
||||
loss2 = (torch.acos(torch.clip(batch_dict[key][:, 0, 0].view(-1, 1), -1, 1)) -
|
||||
torch.acos(torch.clip(batch_dict['pose_to_frame'][:, 0, 0].view(-1, 1), -1, 1))).norm(dim=1).mean() / 3.1415 * 180
|
||||
return loss1, loss2
|
||||
|
||||
|
||||
def gen_points_loss(batch_dict):
|
||||
key_points_gen = batch_dict['key_points_gen']
|
||||
key_points = batch_dict['key_points']
|
||||
key_points_gen1 = torch.cat((key_points_gen, key_points_gen * 0), dim=2)
|
||||
key_points_gen1[:, :, 3] = 1
|
||||
# pose_query=batch_dict['pose_query']
|
||||
# pose_positive=batch_dict['pose_positive']
|
||||
# poses=torch.cat((pose_query,pose_positive),dim=0)
|
||||
# key_points_gen2=torch.bmm(poses,key_points_gen1.permute(0,2,1)).permute(0,2,1)
|
||||
# key_points2=torch.bmm(poses,key_points.permute(0,2,1)).permute(0,2,1)
|
||||
# loss_gpo=(key_points_gen2[:,:,:2]-key_points2[:,:,:2]).norm(p=1,dim=2).mean()
|
||||
pose_to_frame = batch_dict['pose_to_frame']
|
||||
B = pose_to_frame.shape[0]
|
||||
src_pts = key_points[:B]
|
||||
tgt_pts = key_points[B:]
|
||||
src_pts_gen = key_points_gen1[:B]
|
||||
tgt_pts_gen = key_points_gen1[B:]
|
||||
srcs = torch.cat((src_pts, src_pts_gen), dim=0)
|
||||
tgts = torch.cat((tgt_pts_gen, tgt_pts), dim=0)
|
||||
pose_to_frame1 = torch.cat((pose_to_frame, pose_to_frame), dim=0)
|
||||
srcs1 = torch.bmm(pose_to_frame1, srcs.permute(0, 2, 1)).permute(0, 2, 1)
|
||||
loss = torch.mean(torch.abs(srcs1[:, :, :2] - tgts[:, :, :2]))
|
||||
return loss
|
||||
|
||||
|
||||
def rand_dis(x, y):
|
||||
assert len(x.shape)==2 and len(y.shape)==2,'x and y must be 2 dim'
|
||||
N, N = x.size()
|
||||
ids=torch.arange(N).to(x.device)
|
||||
idx = ids.view(1, N).repeat(N, 1)
|
||||
mask = ~(idx == idx.transpose(0, 1))
|
||||
idx1 = idx[mask].view(N, N - 1)
|
||||
random_indices = torch.randint(N - 1, size=(N,)).to(x.device)
|
||||
rand_idx = torch.gather(idx1, 1, random_indices.view(-1, 1))
|
||||
rand_idx1 = torch.cat([ids.view(N, 1), rand_idx], dim=1)
|
||||
diag = ids.view(N, 1).repeat(1, 2)
|
||||
x1 = x[rand_idx1[:, 0], rand_idx1[:, 1]]
|
||||
x2 = x1*0
|
||||
x3 = torch.cat([x1.view(N, 1), x2.view(N, 1)], dim=1)
|
||||
y1 = y[rand_idx1[:, 0], rand_idx1[:, 1]]
|
||||
y2 = y[diag[:, 0], diag[:, 1]]
|
||||
y3 = torch.cat([y1.view(N, 1), y2.view(N, 1)], dim=1)
|
||||
dis=torch.abs(x3-y3).mean()+F.relu(0.2-torch.abs(y1)).mean()
|
||||
return dis
|
||||
|
||||
|
||||
def gen_feature_loss(batch_dict):
|
||||
#BCN
|
||||
fea_pt_dual_gen = batch_dict['fea_pt_dual_gen']
|
||||
fea_pl_dual_gen = batch_dict['fea_pl_dual_gen']
|
||||
fea_kpt_original_gen = batch_dict['fea_kpt_original_gen']
|
||||
# fea_kpt_gen_gen=batch_dict['fea_kpt_gen_gen']
|
||||
fea_pt_dual = batch_dict['fea_pt_dual']
|
||||
fea_pl_dual = batch_dict['fea_pl_dual']
|
||||
fea_kpt_original = batch_dict['fea_kpt_original']
|
||||
# fea_pt_dual = batch_dict['fea_pt_dual'].detach()
|
||||
# fea_pl_dual = batch_dict['fea_pl_dual'].detach()
|
||||
# fea_kpt_original = batch_dict['fea_kpt_original'].detach()
|
||||
|
||||
b = fea_pl_dual.shape[0]
|
||||
loss0 = 0
|
||||
loss1 = 0
|
||||
loss2 = 0
|
||||
loss3 = 0
|
||||
relation = batch_dict['relation']
|
||||
nums=0
|
||||
for i in range(b):
|
||||
cnt = torch.sum((relation[i, :, -1, 0] > 0) & (relation[i, :, -1, 1] > 0))
|
||||
nums+=cnt
|
||||
fea_pt_dual1 = fea_pt_dual[i, :, :cnt] # 匹配点云特征,CN
|
||||
fea_pt_dual_gen1 = fea_pt_dual_gen[i, :, :cnt] # 匹配点云特征,生成于图像
|
||||
fea_pl_dual1 = fea_pl_dual[i, :, :cnt] # 匹配图像特征
|
||||
fea_pl_dual_gen1 = fea_pl_dual_gen[i, :, :cnt] # 匹配图像特征,生成于点云
|
||||
|
||||
# loss0 = loss0 + torch.abs(fea_pt_dual1 - fea_pt_dual_gen1).mean()
|
||||
loss0 = loss0 + (1 - F.cosine_similarity(fea_pt_dual1,fea_pt_dual_gen1,dim=0)).mean()
|
||||
# loss0 = loss0 + F.mse_loss(fea_pt_dual1, fea_pt_dual_gen1)
|
||||
# loss0 = loss0 + ((fea_pt_dual1 - fea_pt_dual_gen1).norm(p=2, dim=0)).mean()
|
||||
# sims00=tools.batch_distance(fea_pt_dual1.unsqueeze(0).permute(0,2,1),fea_pt_dual1.unsqueeze(0).permute(0,2,1),'cosine')
|
||||
# sims01=tools.batch_distance(fea_pt_dual_gen1.unsqueeze(0).permute(0,2,1),fea_pt_dual_gen1.unsqueeze(0).permute(0,2,1),'cosine')
|
||||
# loss0 = loss0 + torch.abs(sims00-sims01).mean()
|
||||
|
||||
# loss1 = loss1 + torch.abs(fea_pl_dual1 - fea_pl_dual_gen1).mean()
|
||||
loss1 = loss1 + (1 - F.cosine_similarity(fea_pl_dual1,fea_pl_dual_gen1,dim=0)).mean()
|
||||
# loss1 = loss1 + F.mse_loss(fea_pl_dual1, fea_pl_dual_gen1)
|
||||
# loss1 = loss1 + ((fea_pl_dual1 - fea_pl_dual_gen1).norm(p=2, dim=0)).mean()
|
||||
# sims10=tools.batch_distance(fea_pl_dual1.unsqueeze(0).permute(0,2,1),fea_pl_dual1.unsqueeze(0).permute(0,2,1),'cosine')
|
||||
# sims11=tools.batch_distance(fea_pl_dual_gen1.unsqueeze(0).permute(0,2,1),fea_pl_dual_gen1.unsqueeze(0).permute(0,2,1),'cosine')
|
||||
# loss1 = loss1 + torch.abs(sims10-sims11).mean()
|
||||
|
||||
#全景特征生成模块损失计算
|
||||
# loss2 = loss2 + torch.abs(fea_kpt_original[i] - fea_kpt_original_gen[i]).mean()
|
||||
loss2= loss2 + (1-F.cosine_similarity(fea_kpt_original[i], fea_kpt_original_gen[i],dim=0)).mean()
|
||||
# loss2 = loss2 + F.mse_loss(fea_kpt_original[i], fea_kpt_original_gen[i])
|
||||
# loss2 = loss2 + ((fea_kpt_original[i] - fea_kpt_original_gen[i]).norm(p=2, dim=0)).mean()
|
||||
# sims20=tools.batch_distance(fea_kpt_original[i:i+1].permute(0,2,1),fea_kpt_original[i:i+1].permute(0,2,1),'cosine')
|
||||
# sims21=tools.batch_distance(fea_kpt_original_gen[i:i+1].permute(0,2,1),fea_kpt_original_gen[i:i+1].permute(0,2,1),'cosine')
|
||||
# loss2 = loss2 + torch.abs(sims20-sims21).mean()
|
||||
loss0 = loss0 / b
|
||||
loss1 = loss1 / b
|
||||
loss2 = loss2 / b
|
||||
return loss0, loss1, loss2, loss3
|
||||
|
||||
|
||||
def sinkhorn_matches_loss(batch_dict,key):
|
||||
project_kpts = batch_dict[key] # calculated from corrspondence of kpts
|
||||
src_coords = batch_dict['key_points']
|
||||
pose_to_frame = batch_dict['pose_to_frame']
|
||||
src_coords = src_coords.clone().view(batch_dict['batch_size'], -1, 4)
|
||||
B, N_POINT, _ = src_coords.shape
|
||||
B = B // 2
|
||||
src_coords = src_coords[:B, :, [0, 1, 2, 3]]
|
||||
src_coords[:, :, -1] = 1.
|
||||
gt_dst_coords = torch.bmm(pose_to_frame, src_coords.permute(0, 2, 1)) # True project kpts
|
||||
gt_dst_coords = gt_dst_coords.permute(0, 2, 1)[:, :, :3]
|
||||
loss = (gt_dst_coords - project_kpts).norm(dim=2).mean()
|
||||
return loss
|
||||
|
||||
|
||||
|
||||
def score_loss(batch_dict):
|
||||
score = batch_dict['score_bev']
|
||||
label_score = batch_dict['label_score']
|
||||
label_score = torch.cat([label_score[:, :, :, 0], label_score[:, :, :, 1]], dim=0)
|
||||
mask1 = score > 1e-8
|
||||
# mask2 = label_score > 1e-8
|
||||
# mask = mask1 | mask2
|
||||
score = score[mask1]
|
||||
label_score = label_score[mask1]
|
||||
loss = nn.functional.mse_loss(score, label_score)
|
||||
|
||||
return loss
|
||||
|
||||
|
||||
def pose_loss(batch_dict,key):
|
||||
src_coords = batch_dict['key_points']
|
||||
src_coords = src_coords.clone().view(batch_dict['batch_size'], -1, 4)
|
||||
delta_pose = batch_dict['pose_to_frame']
|
||||
B, N_POINT, _ = src_coords.shape
|
||||
B = B // 2
|
||||
src_coords = src_coords[:B]
|
||||
gt_dst_coords = torch.bmm(delta_pose, src_coords.permute(0, 2, 1)).float()
|
||||
gt_dst_coords = gt_dst_coords.permute(0, 2, 1)[:, :, :3]
|
||||
|
||||
transformation = batch_dict[key]
|
||||
pred_dst_coords = torch.bmm(transformation, src_coords.permute(0, 2, 1))
|
||||
pred_dst_coords = pred_dst_coords.permute(0, 2, 1)[:, :, :3]
|
||||
loss = torch.mean(torch.abs(pred_dst_coords - gt_dst_coords))
|
||||
return loss
|
||||
|
||||
|
||||
def get_all_triplets(dist_mat, pos_mask, neg_mask, is_inverted=False, margin=0.5, different_embedding=False):
|
||||
if not different_embedding:
|
||||
pos_mask = torch.triu(pos_mask, 1)
|
||||
triplets = pos_mask.unsqueeze(2) * neg_mask.unsqueeze(1)
|
||||
return torch.where(triplets)
|
||||
|
||||
|
||||
def hardest_negative_selector(dist_mat, pos_mask, neg_mask, is_inverted, margin=0.5, different_embedding=False):
|
||||
if not different_embedding:
|
||||
pos_mask = torch.triu(pos_mask, 1)
|
||||
a, p = torch.where(pos_mask)
|
||||
if neg_mask.sum() == 0:
|
||||
return a, p, None
|
||||
if is_inverted:
|
||||
dist_neg = dist_mat * neg_mask
|
||||
n = torch.max(dist_neg, dim=1)
|
||||
else:
|
||||
dist_neg = dist_mat.clone()
|
||||
dist_neg[~neg_mask] = dist_neg.max() + 1.
|
||||
_, n = torch.min(dist_neg, dim=1)
|
||||
n = n[a]
|
||||
return a, p, n
|
||||
|
||||
|
||||
def random_negative_selector(dist_mat, pos_mask, neg_mask, is_inverted, margin=0.5, different_embedding=False):
|
||||
if not different_embedding:
|
||||
pos_mask = torch.triu(pos_mask, 1)
|
||||
a, p = torch.where(pos_mask)
|
||||
selected_negs = []
|
||||
for i in range(a.shape[0]):
|
||||
possible_negs = torch.where(neg_mask[a[i]])[0]
|
||||
if len(possible_negs) == 0:
|
||||
return a, p, None
|
||||
|
||||
dist_neg = dist_mat[a[i], possible_negs]
|
||||
if is_inverted:
|
||||
curr_loss = -dist_mat[a[i], p[i]] + dist_neg + margin
|
||||
else:
|
||||
curr_loss = dist_mat[a[i], p[i]] - dist_neg + margin
|
||||
|
||||
if len(possible_negs[curr_loss > 0]) > 0:
|
||||
possible_negs = possible_negs[curr_loss > 0]
|
||||
random_neg = np.random.choice(possible_negs.cpu().numpy())
|
||||
selected_negs.append(random_neg)
|
||||
n = torch.tensor(selected_negs, dtype=a.dtype, device=a.device)
|
||||
return a, p, n
|
||||
|
||||
|
||||
def semihard_negative_selector(dist_mat, pos_mask, neg_mask, is_inverted, margin=0.5, different_embedding=False):
|
||||
if not different_embedding:
|
||||
pos_mask = torch.triu(pos_mask, 1)
|
||||
a, p = torch.where(pos_mask)
|
||||
selected_negs = []
|
||||
for i in range(a.shape[0]):
|
||||
possible_negs = torch.where(neg_mask[a[i]])[0]
|
||||
if len(possible_negs) == 0:
|
||||
return a, p, None
|
||||
|
||||
dist_neg = dist_mat[a[i], possible_negs]
|
||||
if is_inverted:
|
||||
curr_loss = -dist_mat[a[i], p[i]] + dist_neg + margin
|
||||
else:
|
||||
curr_loss = dist_mat[a[i], p[i]] - dist_neg + margin
|
||||
|
||||
semihard_idxs = (curr_loss > 0) & (curr_loss < margin)
|
||||
if len(possible_negs[semihard_idxs]) > 0:
|
||||
possible_negs = possible_negs[semihard_idxs]
|
||||
random_neg = np.random.choice(possible_negs.cpu().numpy())
|
||||
selected_negs.append(random_neg)
|
||||
n = torch.tensor(selected_negs, dtype=a.dtype, device=a.device)
|
||||
return a, p, n
|
||||
|
||||
|
||||
class TripletLoss(nn.Module):
|
||||
def __init__(self, margin: float, triplet_selector, distance: distances.BaseDistance):
|
||||
super(TripletLoss, self).__init__()
|
||||
self.margin = margin
|
||||
self.triplet_selector = triplet_selector
|
||||
self.distance = distance
|
||||
|
||||
def forward(self, embeddings, pos_mask, neg_mask, other_embeddings=None):
|
||||
if other_embeddings is None:
|
||||
other_embeddings = embeddings
|
||||
dist_mat = self.distance(embeddings, other_embeddings)
|
||||
triplets = self.triplet_selector(
|
||||
dist_mat, pos_mask, neg_mask, self.distance.is_inverted)
|
||||
distance_positive = dist_mat[triplets[0], triplets[1]]
|
||||
if triplets[-1] is None:
|
||||
if self.distance.is_inverted:
|
||||
return F.relu(1 - distance_positive).mean()
|
||||
else:
|
||||
return F.relu(distance_positive).mean()
|
||||
distance_negative = dist_mat[triplets[0], triplets[2]]
|
||||
curr_margin = self.distance.margin(
|
||||
distance_positive, distance_negative)
|
||||
loss = F.relu(curr_margin + self.margin)
|
||||
return loss.mean()
|
||||
|
||||
|
||||
def _pairwise_distance(x, squared=False, eps=1e-16):
|
||||
# Compute the 2D matrix of distances between all the embeddings.
|
||||
|
||||
cor_mat = torch.matmul(x, x.t())
|
||||
norm_mat = cor_mat.diag()
|
||||
distances = norm_mat.unsqueeze(1) - 2 * cor_mat + norm_mat.unsqueeze(0)
|
||||
distances = F.relu(distances)
|
||||
|
||||
if not squared:
|
||||
mask = torch.eq(distances, 0.0).float()
|
||||
distances = distances + mask * eps
|
||||
distances = torch.sqrt(distances)
|
||||
distances = distances * (1.0 - mask)
|
||||
|
||||
return distances
|
||||
|
||||
|
||||
class TotalLoss(nn.Module):
|
||||
def __init__(self, cfg):
|
||||
super(TotalLoss, self).__init__()
|
||||
if 'hardest' == cfg['negetative_selsector']:
|
||||
neg_selector = hardest_negative_selector
|
||||
elif 'semihard' == cfg['negetative_selsector']:
|
||||
neg_selector = semihard_negative_selector
|
||||
else:
|
||||
neg_selector = random_negative_selector
|
||||
self.trip_fun = TripletLoss(margin=cfg['trip_margin'], triplet_selector=neg_selector, distance=distances.LpDistance())
|
||||
self.negetative_distcance = 50
|
||||
|
||||
def forward(self, batch_dict):
|
||||
l_pose=l_score=l_match=l_tra=l_rot=l_gb=l_gi=l_gpa=l_gpo=l_kpl = 0
|
||||
if 'key_points' in batch_dict.keys():
|
||||
l_score = score_loss(batch_dict)
|
||||
l_match1,l_pose1,l_match2,l_pose2,l_tra1,l_rot1,l_tra2,l_rot2=0,0,0,0,0,0,0,0
|
||||
|
||||
if 'transformation_original' in batch_dict.keys():
|
||||
l_match1 = sinkhorn_matches_loss(batch_dict,'project_kpts_original')
|
||||
l_tra1, l_rot1 = tr_loss(batch_dict,'transformation_original')
|
||||
l_pose1 = pose_loss(batch_dict,'transformation_original')
|
||||
if 'transformation_fusion' in batch_dict.keys():
|
||||
l_match2 = sinkhorn_matches_loss(batch_dict,'project_kpts_fusion')
|
||||
l_tra2, l_rot2 = tr_loss(batch_dict,'transformation_fusion')
|
||||
l_pose2 = pose_loss(batch_dict,'transformation_fusion')
|
||||
cnt=1
|
||||
if min(l_rot1,l_rot2)>0:
|
||||
cnt=2
|
||||
l_match=(l_match1+l_match2)/cnt
|
||||
l_pose=(l_pose1+l_pose2)/cnt
|
||||
l_tra=(l_tra1+l_tra2)/cnt
|
||||
l_rot=(l_rot1+l_rot2)/cnt
|
||||
if ('fea_pt_dual_gen' in batch_dict.keys()) or ('fea_pl_dual_gen' in batch_dict.keys()):
|
||||
l_gb, l_gi, l_gpa,l_kpl = gen_feature_loss(batch_dict)
|
||||
if 'key_points_gen' in batch_dict.keys():
|
||||
l_gpo = gen_points_loss(batch_dict)
|
||||
if 'sequence' in batch_dict:
|
||||
neg_mask = batch_dict['sequence'].view(1, -1) != batch_dict['sequence'].view(-1, 1)
|
||||
else:
|
||||
neg_mask = torch.zeros((batch_dict['pose_query'].shape[0] * 2, batch_dict['pose_query'].shape[0] * 2), dtype=torch.bool)
|
||||
pair_dist = _pairwise_distance(batch_dict['pose_query'][:, 0:3, 3])
|
||||
neg_mask = ((pair_dist > self.negetative_distcance) | neg_mask.to(pair_dist.device))
|
||||
neg_mask = neg_mask.repeat(2, 2)
|
||||
batch_size = batch_dict['batch_size']
|
||||
pos_mask = torch.zeros((batch_size, batch_size), device=neg_mask.device)
|
||||
|
||||
for i in range(batch_size // 2):
|
||||
pos_mask[i, i + batch_size // 2] = 1
|
||||
pos_mask[i + batch_size // 2, i] = 1
|
||||
l_triplet = self.trip_fun(batch_dict['vlads'], pos_mask, neg_mask)
|
||||
l_total = l_score + l_pose + 0.05 * l_match + l_triplet + (l_gb + l_gi + l_gpa + l_kpl)
|
||||
loss = [l_total, l_pose, l_score, l_match, l_triplet, l_tra, l_rot, l_gb, l_gi, l_gpa, l_gpo,l_kpl]
|
||||
for i in range(len(loss)):
|
||||
if loss[i]==0:
|
||||
loss[i]=loss[0]*0
|
||||
batch_dict['loss']=loss
|
||||
return batch_dict
|
||||
BIN
models/checkpoint_bev.pth.tar
Normal file
BIN
models/checkpoint_bev.pth.tar
Normal file
Binary file not shown.
BIN
models/checkpoint_bevp.pth.tar
Normal file
BIN
models/checkpoint_bevp.pth.tar
Normal file
Binary file not shown.
BIN
models/checkpoint_fusion.pth.tar
Normal file
BIN
models/checkpoint_fusion.pth.tar
Normal file
Binary file not shown.
628
net.py
Normal file
628
net.py
Normal file
@@ -0,0 +1,628 @@
|
||||
import cv2
|
||||
import math
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
import torch
|
||||
import torch._utils
|
||||
import torch.nn as nn
|
||||
import torch.nn.functional as F
|
||||
from uot import UOTHead
|
||||
from netvlad import NetVLAD, NetVLADLoupe
|
||||
from ALIKE.alike import configs
|
||||
from ALIKE.alnet import ALNet
|
||||
from BEVNet import RICNN, EncodePosition, RIAvgpool2d, RIMaxpool2d
|
||||
import tools
|
||||
|
||||
|
||||
def simple_nms(scores, nms_radius=2, itertation=2, mode='1'):
|
||||
""" Fast Non-maximum suppression to remove nearby points """
|
||||
assert (nms_radius >= 0)
|
||||
if mode == 'ri':
|
||||
max_pool = RIMaxpool2d(kernel_size=nms_radius * 2 + 1, stride=1, padding=nms_radius)
|
||||
else:
|
||||
max_pool = nn.MaxPool2d(kernel_size=nms_radius * 2 + 1, stride=1, padding=nms_radius)
|
||||
|
||||
zeros = torch.zeros_like(scores)
|
||||
max_mask = scores == max_pool(scores)
|
||||
|
||||
for _ in range(itertation):
|
||||
supp_mask = max_pool(max_mask.float()) > 0
|
||||
supp_scores = torch.where(supp_mask, zeros, scores)
|
||||
new_max_mask = supp_scores == max_pool(supp_scores)
|
||||
max_mask = max_mask | (new_max_mask & (~supp_mask))
|
||||
return torch.where(max_mask, scores, zeros)
|
||||
|
||||
|
||||
class BEVHead(nn.Module):
|
||||
def __init__(self, alnet='alike-n', iter=5, num_kpt=100, cluster_num=16, vlad_size=256):
|
||||
super(BEVHead, self).__init__()
|
||||
cfg = configs[alnet]
|
||||
self.feature_extractor = ALNet(c1=cfg['c1'], c2=cfg['c2'], c3=cfg['c3'], c4=cfg['c4'], dim=cfg['dim'],
|
||||
single_head=cfg['single_head'])
|
||||
self.feature_size = int(self.feature_extractor.feature_size)
|
||||
self.select = 'maxpool'
|
||||
self.num_kpt = num_kpt
|
||||
self.ep = EncodePosition(feature_size=self.feature_size)
|
||||
self.uot = UOTHead(nb_iter=iter,name='original')
|
||||
self.netvlad_bev = NetVLAD(self.feature_size, cluster_num)
|
||||
# state_dict=torch.load('/data4/caodanyang/results/FUSIONLCD/bev_07250/models/checkpoint_049.pth.tar', map_location='cpu')['model']
|
||||
# state_dict_new={}
|
||||
# for k,v in state_dict.items():
|
||||
# state_dict_new[k[4:]]=v
|
||||
# self.load_state_dict(state_dict_new)
|
||||
# for param in self.parameters():
|
||||
# param.requires_grad = False
|
||||
|
||||
def forward(self, batch_dict):
|
||||
assert type(batch_dict) is dict, 'Input should be a dict'
|
||||
bev = batch_dict['bev']
|
||||
guider = (bev[:, 2:3] > 0).float()
|
||||
b, c, h_bev, w_bev = bev.shape
|
||||
x = bev[:, 0:3, :, :]
|
||||
points = bev[:, 3:7, :, :] # xyzi
|
||||
points[:, 2] = 0
|
||||
points[:, 3] = 1
|
||||
|
||||
score_bev, feature_bev = self.feature_extractor(x)
|
||||
score_bev = score_bev * guider
|
||||
|
||||
if self.select == 'avgpool':
|
||||
avgpool = RIAvgpool2d(kernel_size=5, stride=4, padding=1)
|
||||
grid = np.array(np.meshgrid(np.arange(h_bev), np.arange(w_bev))).swapaxes(0, 2)
|
||||
grid = torch.from_numpy(grid).to(x.device).permute(2, 0, 1).unsqueeze(0).repeat(b, 1, 1, 1)
|
||||
score_bev_avg = avgpool(score_bev)
|
||||
grid_avg = avgpool(grid.float() * score_bev) / (score_bev_avg + 1e-8)
|
||||
grid_avg = torch.round(grid_avg).long().permute(0, 2, 3, 1)
|
||||
points_avg = avgpool(score_bev * points) / (score_bev_avg + 1e-8)
|
||||
feature_bev_avg = avgpool(feature_bev * score_bev) / (score_bev_avg + 1e-8)
|
||||
score_bev = score_bev.view(b, h_bev, w_bev)
|
||||
score_bev_avg = score_bev_avg.squeeze(1)
|
||||
kpts = []
|
||||
feas_kpt = []
|
||||
pixels_kpt = []
|
||||
# cnt=0
|
||||
for i in range(b):
|
||||
uv = list(torch.where(score_bev_avg[i] > 0))
|
||||
num_kpt = int(self.num_kpt)
|
||||
if num_kpt == 0:
|
||||
print('NO BEV key point')
|
||||
exit()
|
||||
while len(uv[0]) < num_kpt:
|
||||
uv[0] = torch.cat([uv[0], uv[0][:(num_kpt - len(uv[0]))]])
|
||||
uv[1] = torch.cat([uv[1], uv[1][:(num_kpt - len(uv[1]))]])
|
||||
score_bev0 = score_bev_avg[i, uv[0], uv[1]]
|
||||
score_bev1, idx = torch.topk(score_bev0, k=self.num_kpt)
|
||||
# cnt=max(cnt,len(uv[0]))
|
||||
# idx=torch.arange(len(uv[0])).to(x.device)
|
||||
pc = points_avg[i, :, uv[0], uv[1]].permute(1, 0)
|
||||
# pc = torch.cat([pc, pc * 0], dim=1)
|
||||
kpt = pc[idx]
|
||||
fea_kpt = feature_bev_avg[i, :, uv[0][idx], uv[1][idx]]
|
||||
pixel_kpt = grid_avg[i, uv[0][idx], uv[1][idx]]
|
||||
pixels_kpt.append(pixel_kpt)
|
||||
kpts.append(kpt.unsqueeze(0))
|
||||
feas_kpt.append(fea_kpt.unsqueeze(0))
|
||||
else:
|
||||
score_bev_max = simple_nms(score_bev, nms_radius=3)
|
||||
score_bev = score_bev.view(b, h_bev, w_bev)
|
||||
score_bev_max = score_bev_max.view(b, h_bev, w_bev)
|
||||
kpts = []
|
||||
feas_kpt = []
|
||||
pixels_kpt = []
|
||||
for i in range(b):
|
||||
uv = list(torch.where((score_bev[i] == score_bev_max[i]) & (score_bev[i] > 0)))
|
||||
num_kpt = int(self.num_kpt)
|
||||
if num_kpt == 0:
|
||||
print('NO BEV key point')
|
||||
exit()
|
||||
while len(uv[0]) < num_kpt:
|
||||
uv[0] = torch.cat([uv[0], uv[0][:(num_kpt - len(uv[0]))]])
|
||||
uv[1] = torch.cat([uv[1], uv[1][:(num_kpt - len(uv[1]))]])
|
||||
score_bev0 = score_bev[i, uv[0], uv[1]]
|
||||
# sc0 = score_bev0.cpu().detach().numpy()
|
||||
score_bev1, idx = torch.topk(score_bev0, k=self.num_kpt)
|
||||
pc = points[i, :, uv[0], uv[1]].permute(1, 0)
|
||||
# pc = torch.cat([pc, pc * 0], dim=1)
|
||||
kpt = pc[idx]
|
||||
fea_kpt = feature_bev[i, :, uv[0][idx], uv[1][idx]]
|
||||
pixel_kpt = torch.cat([uv[0][idx], uv[1][idx]]).view(2, -1).T
|
||||
pixels_kpt.append(pixel_kpt.unsqueeze(0))
|
||||
kpts.append(kpt.unsqueeze(0))
|
||||
feas_kpt.append(fea_kpt.unsqueeze(0))
|
||||
|
||||
# kpts1=torch.zeros((b,cnt,kpt.shape[1])).to(x.device)
|
||||
# feas_kpt1=torch.zeros((b,fea_kpt.shape[0],cnt)).to(x.device)
|
||||
# for i in range(b):
|
||||
# kpts1[i,:kpts[i].shape[1]]=kpts[i].squeeze(0)
|
||||
# feas_kpt1[i,:,:feas_kpt[i].shape[2]]=feas_kpt[i].squeeze(0)
|
||||
|
||||
kpts = torch.cat(kpts)
|
||||
feas_kpt = torch.cat(feas_kpt)
|
||||
pixels_kpt = torch.cat(pixels_kpt)
|
||||
if hasattr(self, 'ep'):
|
||||
feas_kpt = self.ep(kpts, feas_kpt)
|
||||
batch_dict['pixels_kpt'] = pixels_kpt
|
||||
batch_dict['score_bev'] = score_bev
|
||||
batch_dict['fea_kpt_original'] = feas_kpt
|
||||
batch_dict['fea_bev'] = feature_bev
|
||||
batch_dict['key_points'] = kpts
|
||||
if hasattr(self, 'netvlad_bev'):
|
||||
try:
|
||||
vlad_bev = self.netvlad_bev(feas_kpt.transpose(1, 2).contiguous())
|
||||
except:
|
||||
vlad_bev = self.netvlad_bev(feas_kpt.unsqueeze(3))
|
||||
batch_dict['vlad_bev'] = vlad_bev
|
||||
if ('pose_to_frame' in batch_dict.keys()) and (hasattr(self, 'uot')):
|
||||
self.uot(batch_dict)
|
||||
#################################### show bev and kpt ############################################
|
||||
if 0:
|
||||
for i in range(b):
|
||||
bevshow = x[i].permute(1, 2, 0).cpu().detach().numpy()
|
||||
bevshow = np.ascontiguousarray(bevshow[:, :, 0:3] * 255, dtype=np.uint8)
|
||||
bevshow1 = bevshow.copy()
|
||||
bevshow1[:, 1] = [255, 255, 255]
|
||||
for j in range(kpt.shape[0]):
|
||||
center = (int(uv[1][idx[j]].cpu().detach().numpy()), int(uv[0][idx[j]].cpu().detach().numpy()))
|
||||
cv2.circle(bevshow1, center, 2, (0, 0, 255), -1, cv2.LINE_AA)
|
||||
bevshow2 = np.hstack((bevshow, bevshow1))
|
||||
# cv2.namedWindow('2', cv2.WINDOW_NORMAL)
|
||||
# cv2.imshow('2', bevshow2)
|
||||
# cv2.waitKey(0)
|
||||
fig = plt.figure()
|
||||
plt.imshow(bevshow2)
|
||||
plt.show()
|
||||
#########################################################################################################
|
||||
|
||||
#################################### show match ############################################
|
||||
if 0:
|
||||
for i in range(b // 2):
|
||||
kpt1 = kpts[i]
|
||||
pose_to_frame = batch_dict['pose_to_frame'][i]
|
||||
# pose_to_frame = batch_dict['transformation'][i]
|
||||
# pose_to_frame = torch.cat((pose_to_frame, torch.tensor([0, 0, 0, 1]).view(1, 4).to(pose_to_frame.device)))
|
||||
kpt1 = (pose_to_frame @ kpt1.permute(1, 0)).permute(1, 0)
|
||||
kpt2 = kpts[i + b // 2]
|
||||
bev1 = batch_dict['bev'][i][0:3].permute(1, 2, 0)
|
||||
bev1 = np.ascontiguousarray(bev1.cpu().detach().numpy() * 255, dtype=np.uint8)
|
||||
bev2 = batch_dict['bev'][i + b // 2][0:3].permute(1, 2, 0)
|
||||
bev2 = np.ascontiguousarray(bev2.cpu().detach().numpy() * 255, dtype=np.uint8)
|
||||
pixel1 = pixels_kpt[i].cpu().detach().numpy()
|
||||
pixel2 = pixels_kpt[i + b // 2].cpu().detach().numpy()
|
||||
fea1 = feas_kpt[i].permute(1, 0).cpu().detach().numpy()
|
||||
fea2 = feas_kpt[i + b // 2].permute(1, 0).cpu().detach().numpy()
|
||||
idx1, idx2, dis = tools.nn_match(fea1, fea2, 'cosine')
|
||||
# idx11, idx21, dis1 = tools.nn_match(kpt1, kpt2, 'euclidean')
|
||||
# idx1 = idx1[dis < 0.1]
|
||||
# idx2 = idx2[dis < 0.1]
|
||||
h, w, _ = bev1.shape
|
||||
img = np.hstack((bev1, bev2))
|
||||
img[:, w] = [255, 255, 255]
|
||||
tp = 0
|
||||
img1 = img.copy()
|
||||
for j in range(len(pixel1)):
|
||||
center1 = (int(pixel1[j, 1]), int(pixel1[j, 0]))
|
||||
center2 = (int(pixel2[j, 1]) + w, int(pixel2[j, 0]))
|
||||
cv2.circle(img, center1, 2, (155, 155, 155), -1, cv2.LINE_AA)
|
||||
cv2.circle(img, center2, 2, (155, 155, 155), -1, cv2.LINE_AA)
|
||||
for j in range(len(idx1)):
|
||||
center1 = (int(pixel1[idx1[j], 1]), int(pixel1[idx1[j], 0]))
|
||||
center2 = (int(pixel2[idx2[j], 1]) + w, int(pixel2[idx2[j], 0]))
|
||||
dis_kpt = (kpt1[idx1[j]] - kpt2[idx2[j]]).norm(p=2)
|
||||
if dis_kpt < 2:
|
||||
tp = tp + 1
|
||||
cv2.line(img, center1, center2, (0, 166, 0), 1, cv2.LINE_AA)
|
||||
else:
|
||||
cv2.line(img, center1, center2, (0, 0, 188), 1, cv2.LINE_AA)
|
||||
cv2.circle(img, center1, 2, (255, 255, 255), -1, cv2.LINE_AA)
|
||||
cv2.circle(img, center2, 2, (255, 255, 255), -1, cv2.LINE_AA)
|
||||
# print(np.arccos(pose_to_frame.cpu().detach().numpy()[0, 0]) / np.pi * 180, (tp / len(idx1)))
|
||||
img2 = np.vstack((img1, img))
|
||||
img2[h, :] = [255, 255, 255]
|
||||
cv2.namedWindow('bev match %.3f,%.1fdeg' % (tp / len(idx1), np.arccos(pose_to_frame.cpu().detach().numpy()[0, 0]) / np.pi * 180))
|
||||
cv2.imshow('bev match %.3f,%.1fdeg' % (tp / len(idx1), np.arccos(pose_to_frame.cpu().detach().numpy()[0, 0]) / np.pi * 180), img2)
|
||||
cv2.waitKey(0)
|
||||
#####################################################################################################
|
||||
|
||||
############################################ ICP ##################################################
|
||||
if 0:
|
||||
import open3d as o3d
|
||||
for i in range(b // 2):
|
||||
pose_to_frame = batch_dict['pose_to_frame'][i].cpu().detach().numpy()
|
||||
print('angle', np.arccos(pose_to_frame[0, 0]) / 3.14 * 180)
|
||||
transformation = batch_dict['transformation'][i].cpu().detach().numpy()
|
||||
transformation = np.vstack((transformation, [0, 0, 0, 1]))
|
||||
scan1 = batch_dict['scan_query'][i].cpu().detach().numpy()
|
||||
scan2 = batch_dict['scan_positive'][i].cpu().detach().numpy()
|
||||
|
||||
pcd1 = o3d.geometry.PointCloud()
|
||||
pcd1.points = o3d.utility.Vector3dVector(scan1[:, :3])
|
||||
pcd1.colors = o3d.utility.Vector3dVector([[0, 0, 1] for i in range(len(pcd1.points))])
|
||||
|
||||
pcd11 = o3d.geometry.PointCloud()
|
||||
pcd11.points = o3d.utility.Vector3dVector(scan1[:, :3])
|
||||
pcd11.colors = o3d.utility.Vector3dVector([[0, 0, 1] for i in range(len(pcd1.points))])
|
||||
|
||||
pcd2 = o3d.geometry.PointCloud()
|
||||
pcd2.points = o3d.utility.Vector3dVector(scan2[:, :3])
|
||||
pcd2.colors = o3d.utility.Vector3dVector([[0, 1, 0] for i in range(len(pcd2.points))])
|
||||
|
||||
icp_config = o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration=200, relative_fitness=1e-6,
|
||||
relative_rmse=1e-6)
|
||||
trans_init = transformation
|
||||
threshold = 2
|
||||
estimation_method = o3d.pipelines.registration.TransformationEstimationPointToPoint()
|
||||
registration_result = o3d.pipelines.registration.registration_icp(pcd1, pcd2, threshold, trans_init,
|
||||
estimation_method, icp_config)
|
||||
|
||||
# 将待配准点云应用变换
|
||||
pcd1.transform(registration_result.transformation)
|
||||
|
||||
vis1 = o3d.visualization.Visualizer()
|
||||
vis1.create_window(window_name='registration', width=600, height=600) # 创建窗口
|
||||
render_option: o3d.visualization.RenderOption = vis1.get_render_option() # 设置点云渲染参数
|
||||
render_option.background_color = np.array([1, 1, 1]) # 设置背景色(这里为黑色)
|
||||
render_option.point_size = 2 # 设置渲染点的大小
|
||||
vis1.add_geometry(pcd11)
|
||||
vis1.run()
|
||||
|
||||
vis2 = o3d.visualization.Visualizer()
|
||||
vis2.create_window(window_name='registration', width=600, height=600) # 创建窗口
|
||||
render_option: o3d.visualization.RenderOption = vis2.get_render_option() # 设置点云渲染参数
|
||||
render_option.background_color = np.array([1, 1, 1]) # 设置背景色(这里为黑色)
|
||||
render_option.point_size = 2 # 设置渲染点的大小
|
||||
vis2.add_geometry(pcd2)
|
||||
vis2.run()
|
||||
|
||||
vis = o3d.visualization.Visualizer()
|
||||
vis.create_window(window_name='registration', width=600, height=600) # 创建窗口
|
||||
render_option: o3d.visualization.RenderOption = vis.get_render_option() # 设置点云渲染参数
|
||||
render_option.background_color = np.array([1, 1, 1]) # 设置背景色(这里为黑色)
|
||||
render_option.point_size = 2 # 设置渲染点的大小
|
||||
vis.add_geometry(pcd1)
|
||||
vis.add_geometry(pcd2)
|
||||
vis.run()
|
||||
#######################################################################################################
|
||||
|
||||
return batch_dict
|
||||
|
||||
|
||||
class ImgHead(nn.Module):
|
||||
def __init__(self, alnet='alike-n', num_kpt=150, cluster_num=0,vlad_size=256):
|
||||
super(ImgHead, self).__init__()
|
||||
cfg = configs[alnet]
|
||||
self.feature_extractor = ALNet(c1=cfg['c1'], c2=cfg['c2'], c3=cfg['c3'], c4=cfg['c4'], dim=cfg['dim'],
|
||||
single_head=cfg['single_head'])
|
||||
self.feature_size = int(self.feature_extractor.feature_size)
|
||||
# try:
|
||||
# model_path = cfg['model_path']
|
||||
# except:
|
||||
# model_path = ''
|
||||
# if model_path != '':
|
||||
# state_dict = torch.load(model_path)
|
||||
# self.feature_extractor.load_state_dict(state_dict)
|
||||
# for param in self.feature_extractor.parameters():
|
||||
# param.requires_grad = False
|
||||
if num_kpt>0:
|
||||
self.num_kpt = num_kpt
|
||||
|
||||
def forward(self, batch_dict):
|
||||
x = batch_dict['img'][:, 0:3].float() / 255.0
|
||||
# x=x[:,:,:,384:768,]
|
||||
# pixels = batch_dict['img'][:, 3:5]
|
||||
b, c, h, w = x.shape
|
||||
pixel_features = []
|
||||
kpts = []
|
||||
scores = []
|
||||
score_img, feature_img = self.feature_extractor(x)
|
||||
# feature_img=feature_img*0
|
||||
if hasattr(self,'num_kpt') :
|
||||
score_img = simple_nms(score_img, 2, 2)
|
||||
s_thr = 0.1
|
||||
for i in range(b):
|
||||
score_global1 = score_img[i, 0]
|
||||
values, indices = torch.topk(score_global1.view(-1), k=self.num_kpt, dim=0, largest=True)
|
||||
if torch.max(values) < s_thr:
|
||||
print('0 pixel')
|
||||
exit()
|
||||
num_low_value = torch.sum(values < s_thr)
|
||||
if num_low_value > 0:
|
||||
indices1 = indices.clone()
|
||||
indices1[(self.num_kpt - num_low_value):] = indices[:num_low_value]
|
||||
indices = indices1
|
||||
row = torch.div(indices, score_global1.shape[1], rounding_mode='trunc')
|
||||
col = indices % score_global1.shape[1]
|
||||
pixel_feature = feature_img[i:i + 1, :, row, col]
|
||||
pixel_features.append(pixel_feature)
|
||||
kpts.append(torch.cat([row.view(1, -1, 1), col.view(1, -1, 1)], dim=2))
|
||||
scores.append(values.view(1, -1))
|
||||
pixel_features = torch.cat(pixel_features)
|
||||
kpts = torch.cat(kpts)
|
||||
scores = torch.cat(scores)
|
||||
#################################### show match ############################################
|
||||
if 0:
|
||||
for i in range(b // 2):
|
||||
img1 = batch_dict['img'][i][0:3].permute(1, 2, 0)
|
||||
img1 = np.ascontiguousarray(img1.cpu().detach().numpy(), dtype=np.uint8)
|
||||
img2 = batch_dict['img'][i + b // 2][0:3].permute(1, 2, 0)
|
||||
img2 = np.ascontiguousarray(img2.cpu().detach().numpy(), dtype=np.uint8)
|
||||
pixel1 = kpts[i].cpu().detach().numpy()
|
||||
pixel2 = kpts[i + b // 2].cpu().detach().numpy()
|
||||
fea1 = pixel_features[i].permute(1, 0).cpu().detach().numpy()
|
||||
fea2 = pixel_features[i + b // 2].permute(1, 0).cpu().detach().numpy()
|
||||
idx1, idx2, dis = tools.nn_match(fea1, fea2, 'euclidean')
|
||||
idx1 = idx1[dis < 10]
|
||||
idx2 = idx2[dis < 10]
|
||||
h, w, _ = img1.shape
|
||||
img = np.vstack((img1, img2))
|
||||
img[h, :] = [255, 255, 255]
|
||||
for i in range(len(idx1)):
|
||||
center1 = (int(pixel1[idx1[i], 1]), int(pixel1[idx1[i], 0]))
|
||||
center2 = (int(pixel2[idx2[i], 1]), int(pixel2[idx2[i], 0] + h))
|
||||
cv2.line(img, center1, center2, (0, 188, 0), 1, cv2.LINE_AA)
|
||||
cv2.circle(img, center1, 2, (0, 0, 255), -1, cv2.LINE_AA)
|
||||
cv2.circle(img, center2, 2, (0, 0, 255), -1, cv2.LINE_AA)
|
||||
fig = plt.figure()
|
||||
plt.imshow(img[:, :, [2, 1, 0]])
|
||||
plt.show()
|
||||
# cv2.namedWindow('img match')
|
||||
# cv2.imshow('img match', img)
|
||||
# cv2.waitKey(0)
|
||||
|
||||
#########################################################################################################
|
||||
batch_dict['key_pixels'] = kpts
|
||||
batch_dict['fea_kpl'] = pixel_features
|
||||
batch_dict['fea_img'] = feature_img
|
||||
batch_dict['score_img'] = score_img
|
||||
if hasattr(self, 'netvlad_img'):
|
||||
vlad = self.netvlad_img(pixel_features.transpose(1, 2).contiguous())
|
||||
batch_dict['vlad_img'] = vlad
|
||||
|
||||
return batch_dict
|
||||
|
||||
class LocalPool(nn.Module):
|
||||
def __init__(self, in_c):
|
||||
super().__init__()
|
||||
self.conv1 = nn.Conv2d(100, 10, 1, 1, 0, bias=True)
|
||||
self.mp=nn.MaxPool2d((1, 10))
|
||||
|
||||
def forward(self, x):
|
||||
b, c, n, k = x.shape #k=100
|
||||
x1 = x.permute(0, 3, 2, 1) # b,k,n,c
|
||||
x2=self.conv1(x1)
|
||||
x3=x2.permute(0,3,2,1)
|
||||
x4=self.mp(x3)
|
||||
return x4 # bcn1
|
||||
|
||||
class TransformerEncoder(nn.Module):
|
||||
def __init__(self, in_c=128, num_heads=4, dropout=0.1, num_layers=2):
|
||||
super().__init__()
|
||||
encoder_layer = nn.TransformerEncoderLayer(d_model=in_c, nhead=num_heads, dropout=dropout, batch_first=True)
|
||||
self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
|
||||
|
||||
def forward(self, x):
|
||||
y = self.encoder(x)
|
||||
return y
|
||||
|
||||
|
||||
|
||||
class Attention(nn.Module):
|
||||
def __init__(self, d_model):
|
||||
super(Attention, self).__init__()
|
||||
self.d_model = d_model
|
||||
self.w_q = nn.Linear(d_model, d_model)
|
||||
self.w_k = nn.Linear(d_model, d_model)
|
||||
self.w_v = nn.Linear(d_model, d_model)
|
||||
self.fnn = nn.Linear(d_model, d_model)
|
||||
# self.dp=nn.Dropout(0.1)
|
||||
|
||||
def forward(self, q, k=None, v=None):
|
||||
proj_q = self.w_q(q) # BNC
|
||||
proj_k = self.w_k(k)
|
||||
proj_w = self.w_v(v)
|
||||
# proj_q=self.dp(proj_q)
|
||||
# proj_k=self.dp(proj_k)
|
||||
# proj_w=self.dp(proj_w)
|
||||
weights = nn.functional.softmax(torch.matmul(proj_q, proj_k.transpose(-2, -1)) / (self.d_model ** 0.5), dim=-1)
|
||||
attn_output = torch.matmul(weights, proj_w).contiguous()
|
||||
output = self.fnn(attn_output)
|
||||
return output, weights
|
||||
|
||||
|
||||
class Generator(nn.Module):
|
||||
def __init__(self, in_c=128, num=150):
|
||||
super().__init__()
|
||||
self.mha = Attention(in_c)
|
||||
self.conv1 = nn.Sequential(
|
||||
nn.ConvTranspose1d(in_c, in_c, kernel_size=3, stride=3, padding=0),
|
||||
nn.AdaptiveMaxPool1d(num)
|
||||
)
|
||||
|
||||
def forward(self, x):
|
||||
b, c, n = x.shape
|
||||
# x=x.detach()
|
||||
x1 = x.permute(0, 2, 1) # BNC
|
||||
x2, _ = self.mha(x1, x1, x1)
|
||||
x2 = x2.permute(0, 2, 1)
|
||||
x3 = self.conv1(x2)
|
||||
return x3
|
||||
|
||||
|
||||
class Converter(nn.Module):
|
||||
def __init__(self, in_c=128):
|
||||
super().__init__()
|
||||
self.mha = Attention(in_c)
|
||||
self.conv1 = nn.Sequential(
|
||||
nn.Conv1d(in_c, in_c, kernel_size=1, stride=1, padding=0),# nn.BatchNorm1d(in_c), nn.ReLU(),
|
||||
nn.Conv1d(in_c, in_c // 4, kernel_size=1, stride=1, padding=0),# nn.BatchNorm1d(in_c // 4), nn.ReLU(),
|
||||
nn.Conv1d(in_c // 4, in_c // 8, kernel_size=1, stride=1, padding=0),# nn.BatchNorm1d(in_c // 8), nn.ReLU(),
|
||||
nn.Conv1d(in_c // 8, in_c // 4, kernel_size=1, stride=1, padding=0),# nn.BatchNorm1d(in_c // 4), nn.ReLU(),
|
||||
nn.Conv1d(in_c // 4, in_c, kernel_size=1, stride=1, padding=0),# nn.BatchNorm1d(in_c), nn.ReLU(),
|
||||
nn.Conv1d(in_c, in_c, kernel_size=1, stride=1, padding=0)
|
||||
)
|
||||
self.conv2 = nn.Conv1d(in_c * 2, in_c, 1, 1, 0, bias=False)
|
||||
|
||||
def forward(self, x):
|
||||
# return x
|
||||
b, c, n = x.shape
|
||||
# x=x.detach()
|
||||
mask = (x == 0).all(dim=1)
|
||||
x1 = x.permute(0, 2, 1) # BNC
|
||||
x2, _ = self.mha(x1, x1, x1)
|
||||
x2 = x2.permute(0, 2, 1)
|
||||
x3 = self.conv1(x)
|
||||
x4=torch.cat([x2,x3],dim=1)
|
||||
x5=self.conv2(x4)
|
||||
x5 = x5.masked_fill(mask.unsqueeze(1), 0)
|
||||
return x5
|
||||
|
||||
class FusionHead(nn.Module):
|
||||
def __init__(self, in_c=128):
|
||||
super().__init__()
|
||||
self.mha1 = Attention(in_c)
|
||||
self.mha2 = Attention(in_c)
|
||||
self.conv1 = nn.Conv1d(in_c * 2, in_c, 1)
|
||||
|
||||
def forward(self, x):
|
||||
fea_kpt = x[:, :, 0]
|
||||
fea_kpl_gen = x[:, :, 3]
|
||||
B, C, K, N = x.shape
|
||||
x1 = x[:, :, :3] # BC3N
|
||||
x2 = x1.permute(0, 3, 2, 1).contiguous()#BN3C
|
||||
x3 = x2.view(B * N, 3, C)
|
||||
x4, _ = self.mha1(x3, x3, x3)
|
||||
x5 = torch.max(x4, dim=1)[0]#B*N 3 C
|
||||
x6=x5.view(B,N,C)
|
||||
x7, _ = self.mha2(x6, fea_kpl_gen.permute(0, 2, 1), fea_kpl_gen.permute(0, 2, 1))
|
||||
x7 = x7.permute(0, 2, 1)
|
||||
x8 = torch.cat([fea_kpt, x7] ,dim=1)
|
||||
x9 = self.conv1(x8)
|
||||
return x9
|
||||
|
||||
|
||||
def cosine_similarity(feature1, feature2):
|
||||
# BNC
|
||||
feature1 = feature1 / torch.sqrt(torch.sum(feature1 ** 2, -1, keepdim=True) + 1e-8)
|
||||
feature2 = feature2 / torch.sqrt(torch.sum(feature2 ** 2, -1, keepdim=True) + 1e-8)
|
||||
C = torch.bmm(feature1, feature2.transpose(1, 2))
|
||||
# distance_matrix = torch.sum(feature1 ** 2, -1, keepdim=True)
|
||||
# distance_matrix = distance_matrix + torch.sum(feature2 ** 2, -1, keepdim=True).transpose(1, 2)
|
||||
# distance_matrix = distance_matrix - 2 * torch.bmm(feature1, feature2.transpose(1, 2)) # c^2=a^2+b^2-2abcos
|
||||
# C = distance_matrix ** 0.5
|
||||
return C
|
||||
|
||||
|
||||
class Fusion(nn.Module):
|
||||
def __init__(self, cfg):
|
||||
super().__init__()
|
||||
flag = cfg['flag']
|
||||
self.flag = flag
|
||||
if flag == 'fusion':
|
||||
self.img = ImgHead(alnet='alike-n', num_kpt=cfg['kpts_number_img'],
|
||||
cluster_num=cfg['cluster_num_img'], vlad_size=cfg['vlad_size'])
|
||||
self.bev = BEVHead(alnet='alike-n', iter=cfg['sinkhorn_iter'],
|
||||
num_kpt=cfg['kpts_number_bev'], cluster_num=cfg['cluster_num_bev'], vlad_size=cfg['vlad_size'])
|
||||
assert self.img.feature_size == self.bev.feature_size, 'img feature and image feature should be the same size'
|
||||
feature_size = self.img.feature_size
|
||||
self.localpool = LocalPool(feature_size)
|
||||
self.cvt_img = Converter(feature_size)
|
||||
self.cvt_bev = Converter(feature_size)
|
||||
self.gen_pan = Generator(feature_size, cfg['kpts_number_bev'])
|
||||
self.att_fusion = FusionHead(feature_size)
|
||||
# self.netvlad_fusion = NetVLADLoupe(feature_size, cfg['cluster_num_fusion'], cfg['vlad_size'])
|
||||
self.netvlad_fusion = NetVLAD(feature_size, cfg['cluster_num_fusion'])
|
||||
self.uot = UOTHead(nb_iter=cfg['sinkhorn_iter'],name='fusion')
|
||||
self.vlad='fusion'
|
||||
self.w= torch.nn.Parameter(torch.zeros(1))
|
||||
|
||||
if flag == 'bev':
|
||||
self.bev = BEVHead(alnet='alike-n',iter=cfg['sinkhorn_iter'], num_kpt=cfg['kpts_number_bev'], cluster_num=cfg['cluster_num_bev'], vlad_size=256)
|
||||
if flag == 'img':
|
||||
self.img = ImgHead(alnet='alike-n', num_kpt=cfg['kpts_number_img'], cluster_num=cfg['cluster_num_img'], vlad_size=cfg['vlad_size'])
|
||||
|
||||
def forward(self, batch_dict):
|
||||
if self.flag == 'fusion':
|
||||
batch_dict = self.img(batch_dict)
|
||||
batch_dict = self.bev(batch_dict)
|
||||
fea_img = batch_dict['fea_img']
|
||||
fea_bev = batch_dict['fea_bev']
|
||||
relation = batch_dict['relation']
|
||||
fea_kpt_original = batch_dict['fea_kpt_original']
|
||||
# fea_kpl = batch_dict['fea_kpl']
|
||||
# pixel_kpt = batch_dict['pixels_kpt']
|
||||
b, n1, n2, _ = relation.shape
|
||||
n2 = n2 - 1
|
||||
# ns=torch.sum((relation[:,:,-1]>0).all(dim=2),dim=1)
|
||||
# n_least=torch.min(ns)
|
||||
# n_least=min(n_least,256)
|
||||
# relation1=[]
|
||||
# for i in range(b):
|
||||
# idx=torch.randperm(ns[i])[:n_least].to(relation.device)
|
||||
# relation1.append(relation[i:i+1,idx])
|
||||
# relation1=torch.cat(relation1)
|
||||
# relation=relation1
|
||||
pixel_img = relation[:, :, 0:n2].clone()
|
||||
grid_img = pixel_img[:, :, :, [1, 0]].float() / torch.tensor([fea_img.shape[3] - 1, fea_img.shape[2] - 1]).to(fea_img.device).float() * 2 - 1
|
||||
fea_pl_dual = F.grid_sample(fea_img, grid_img, align_corners=True, mode='bilinear', padding_mode='zeros')
|
||||
fea_pl_dual = self.localpool(fea_pl_dual).squeeze(3)
|
||||
fea_pt_dual_gen = self.cvt_bev(fea_pl_dual)
|
||||
|
||||
if 'pose_to_frame' in batch_dict.keys() and hasattr(self, 'uot'):
|
||||
pixel_bev = relation[:, :, n2:n2 + 1, 0:2].clone()
|
||||
grid_bev = pixel_bev[:, :, :, [1, 0]].float() / torch.tensor([fea_bev.shape[3] - 1, fea_bev.shape[2] - 1]).to(fea_bev.device).float() * 2 - 1
|
||||
fea_pt_dual = (F.grid_sample(fea_bev, grid_bev, align_corners=True, mode='bilinear', padding_mode='zeros')).squeeze(3)
|
||||
fea_pl_dual_gen = self.cvt_img(fea_pt_dual)
|
||||
batch_dict['fea_pt_dual_gen'] = fea_pt_dual_gen
|
||||
batch_dict['fea_pl_dual_gen'] = fea_pl_dual_gen
|
||||
batch_dict['fea_pt_dual'] = fea_pt_dual
|
||||
batch_dict['fea_pl_dual'] = fea_pl_dual
|
||||
|
||||
fea_kpt_original_gen = self.gen_pan(fea_pt_dual_gen)
|
||||
batch_dict['fea_kpt_original_gen'] = fea_kpt_original_gen
|
||||
fea_kpl_gen = self.cvt_img(fea_kpt_original)
|
||||
fea_kpt_gen_gen = self.cvt_bev(fea_kpl_gen)
|
||||
batch_dict['fea_kpt_gen_gen'] = fea_kpt_gen_gen
|
||||
batch_dict['fea_kpl_gen']=fea_kpl_gen
|
||||
fea_kpts = torch.cat([fea_kpt_original.unsqueeze(2), fea_kpt_original_gen.unsqueeze(2), fea_kpt_gen_gen.unsqueeze(2), fea_kpl_gen.unsqueeze(2)], dim=2)
|
||||
fea_kpt_fusion = self.att_fusion(fea_kpts)
|
||||
batch_dict['fea_kpt_fusion'] = fea_kpt_original
|
||||
|
||||
# sim10 = cosine_similarity(fea_pt_dual.permute(0, 2, 1), fea_pt_dual.permute(0, 2, 1))[0].cpu().detach().numpy()
|
||||
# sim11 = cosine_similarity(fea_pt_dual_gen.permute(0, 2, 1), fea_pt_dual_gen.permute(0, 2, 1))[0].cpu().detach().numpy()
|
||||
# sim20 = cosine_similarity(fea_pl_dual.permute(0, 2, 1), fea_pl_dual.permute(0, 2, 1))[0].cpu().detach().numpy()
|
||||
# sim21 = cosine_similarity(fea_pl_dual_gen.permute(0, 2, 1), fea_pl_dual_gen.permute(0, 2, 1))[0].cpu().detach().numpy()
|
||||
# sim30 = cosine_similarity(fea_kpt_original.permute(0, 2, 1), fea_kpt_original.permute(0, 2, 1))[0].cpu().detach().numpy()
|
||||
# sim31 = cosine_similarity(fea_kpt_original_gen.permute(0, 2, 1), fea_kpt_original_gen.permute(0, 2, 1))[0].cpu().detach().numpy()
|
||||
# sim32 = cosine_similarity(fea_kpt_gen_gen.permute(0, 2, 1), fea_kpt_gen_gen.permute(0, 2, 1))[0].cpu().detach().numpy()
|
||||
# fig=plt.figure()
|
||||
# plt.subplot(2, 4, 1), plt.imshow(sim10), plt.title('points')
|
||||
# plt.subplot(2, 4, 5), plt.imshow(sim11), plt.title('gen points')
|
||||
# plt.subplot(2, 4, 2), plt.imshow(sim20), plt.title('pixel')
|
||||
# plt.subplot(2, 4, 6), plt.imshow(sim21), plt.title('gen pixel')
|
||||
# plt.subplot(2, 4, 3), plt.imshow(sim30), plt.title('kpt orig')
|
||||
# plt.subplot(2, 4, 7), plt.imshow(sim31), plt.title('pan kpt')
|
||||
# plt.subplot(2, 4, 4), plt.imshow(sim30), plt.title('kpt orig')
|
||||
# plt.subplot(2, 4, 8), plt.imshow(sim32), plt.title('kpt gen gen')
|
||||
# plt.show()
|
||||
|
||||
if 'pose_to_frame' in batch_dict.keys() and hasattr(self, 'uot'):
|
||||
self.uot(batch_dict)
|
||||
|
||||
vlad_fusion = self.netvlad_fusion(fea_kpt_fusion.unsqueeze(3))
|
||||
if self.vlad=='bev':
|
||||
batch_dict['vlads']=batch_dict['vlad_bev']
|
||||
if self.vlad=='fusion':
|
||||
if 'vlad_bev' in batch_dict.keys():
|
||||
batch_dict['vlads']=torch.sigmoid(self.w)*vlad_fusion + (1-torch.sigmoid(self.w))*batch_dict['vlad_bev']
|
||||
else:
|
||||
batch_dict['vlads']=vlad_fusion
|
||||
if self.flag == 'bev':
|
||||
batch_dict = self.bev(batch_dict)
|
||||
batch_dict['vlads'] = batch_dict['vlad_bev']
|
||||
if self.flag == 'img':
|
||||
batch_dict = self.img(batch_dict)
|
||||
batch_dict['vlads'] = batch_dict['vlad_img']
|
||||
return batch_dict
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
b=BEVHead()
|
||||
155
netvlad.py
Normal file
155
netvlad.py
Normal file
@@ -0,0 +1,155 @@
|
||||
import math
|
||||
import torch
|
||||
import torch.nn as nn
|
||||
import torch.nn.functional as F
|
||||
|
||||
|
||||
class NetVLAD(nn.Module):
|
||||
def __init__(self, fea_size=128, num_clusters=16):
|
||||
super(NetVLAD, self).__init__()
|
||||
self.num_clusters = num_clusters
|
||||
self.conv = nn.Conv2d(fea_size, num_clusters, kernel_size=(1, 1), bias=True)
|
||||
self.centroids = nn.Parameter(torch.randn(num_clusters, fea_size))
|
||||
self.relu = nn.ReLU(inplace=True)
|
||||
|
||||
def forward(self, x):
|
||||
"""
|
||||
x: B, C, H, W,W=1
|
||||
"""
|
||||
soft_assign = self.conv(x) # (B, num_clusters, H, W)
|
||||
soft_assign = self.relu(soft_assign)
|
||||
soft_assign = torch.nn.functional.softmax(soft_assign, dim=1) # (B, num_clusters, H, W)
|
||||
|
||||
# reshape for broadcasting
|
||||
B, C, H, W = x.shape
|
||||
soft_assign = soft_assign.view(B, self.num_clusters, -1) # (B, num_clusters, H * W)
|
||||
x_flatten = x.view(B, C, -1) # (B, C, H * W)
|
||||
|
||||
# compute residuals
|
||||
x_flatten1 = x_flatten.unsqueeze(1).permute(0, 1, 3, 2) # (B, 1, H*W, C)
|
||||
centroids = self.centroids.unsqueeze(0).unsqueeze(2) # (1, num_cluster, 1, C)
|
||||
residual = x_flatten1 - centroids # (B, num_clusters, H * W, C)
|
||||
|
||||
residual *= soft_assign.unsqueeze(-1) # (B, num_clusters, H * W, C)
|
||||
|
||||
# sum residuals and assign
|
||||
vlad = residual.sum(dim=-2) # (B, num_clusters, C)
|
||||
vlad = nn.functional.normalize(vlad, p=2, dim=2) # (B, num_clusters, C)
|
||||
|
||||
vlad = vlad.view(B, -1)
|
||||
vlad = nn.functional.normalize(vlad, p=2, dim=1) # (B, num_clusters * C)
|
||||
return vlad
|
||||
|
||||
|
||||
class NetVLADLoupe(nn.Module):
|
||||
"""
|
||||
Original Tensorflow implementation: https://github.com/antoine77340/LOUPE
|
||||
"""
|
||||
|
||||
def __init__(self, feature_size, cluster_size, output_dim,
|
||||
gating=True, add_norm=True, is_training=True, normalization='batch'):
|
||||
super(NetVLADLoupe, self).__init__()
|
||||
self.feature_size = feature_size
|
||||
# output_dim=cluster_size * feature_size
|
||||
self.output_dim = output_dim
|
||||
self.is_training = is_training
|
||||
self.gating = gating
|
||||
self.add_batch_norm = add_norm
|
||||
self.cluster_size = cluster_size
|
||||
if normalization == 'instance':
|
||||
norm = lambda x: nn.LayerNorm(x)
|
||||
elif normalization == 'group':
|
||||
norm = lambda x: nn.GroupNorm(8, x)
|
||||
else:
|
||||
norm = lambda x: nn.BatchNorm1d(x)
|
||||
self.softmax = nn.Softmax(dim=-1)
|
||||
self.cluster_weights = nn.Parameter(torch.randn(feature_size, cluster_size) * 1 / math.sqrt(feature_size))
|
||||
self.cluster_weights2 = nn.Parameter(torch.randn(1, feature_size, cluster_size) * 1 / math.sqrt(feature_size))
|
||||
self.hidden1_weights = nn.Parameter(torch.randn(cluster_size * feature_size, output_dim) * 1 / math.sqrt(feature_size))
|
||||
if add_norm:
|
||||
self.cluster_biases = None
|
||||
self.bn1 = norm(cluster_size)
|
||||
else:
|
||||
self.cluster_biases = nn.Parameter(torch.randn(cluster_size) * 1 / math.sqrt(feature_size))
|
||||
self.bn1 = None
|
||||
|
||||
self.bn2 = norm(output_dim)
|
||||
|
||||
if gating:
|
||||
self.context_gating = GatingContext(output_dim, add_batch_norm=add_norm, normalization=normalization)
|
||||
|
||||
def forward(self, x):
|
||||
"""
|
||||
x: B N C
|
||||
"""
|
||||
max_samples = x.shape[1]
|
||||
activation = torch.matmul(x, self.cluster_weights)
|
||||
if self.add_batch_norm:
|
||||
activation = activation.view(-1, self.cluster_size)
|
||||
activation = self.bn1(activation)
|
||||
activation = activation.view(-1, max_samples, self.cluster_size)
|
||||
else:
|
||||
activation = activation + self.cluster_biases
|
||||
activation = self.softmax(activation)
|
||||
|
||||
a_sum = activation.sum(-2, keepdim=True)
|
||||
a = a_sum * self.cluster_weights2
|
||||
|
||||
activation = torch.transpose(activation, 2, 1)
|
||||
x = x.view((-1, max_samples, self.feature_size))
|
||||
vlad = torch.matmul(activation, x)
|
||||
vlad = torch.transpose(vlad, 2, 1).contiguous()
|
||||
vlad0 = vlad - a
|
||||
|
||||
vlad1 = F.normalize(vlad0, dim=1, p=2, eps=1e-6)
|
||||
vlad2 = vlad1.view((-1, self.cluster_size * self.feature_size))
|
||||
vlad = F.normalize(vlad2, dim=1, p=2, eps=1e-6)
|
||||
|
||||
vlad = torch.matmul(vlad, self.hidden1_weights)
|
||||
|
||||
vlad = self.bn2(vlad)
|
||||
|
||||
if self.gating:
|
||||
vlad = self.context_gating(vlad)
|
||||
# vlad = vlad / vlad.norm(dim=1, keepdim=True)
|
||||
return vlad
|
||||
|
||||
|
||||
class GatingContext(nn.Module):
|
||||
"""
|
||||
Original Tensorflow implementation: https://github.com/antoine77340/LOUPE
|
||||
"""
|
||||
|
||||
def __init__(self, dim, add_batch_norm=True, normalization='batch'):
|
||||
super(GatingContext, self).__init__()
|
||||
self.dim = dim
|
||||
self.add_batch_norm = add_batch_norm
|
||||
if normalization == 'instance':
|
||||
norm = lambda x: nn.LayerNorm(x)
|
||||
elif normalization == 'group':
|
||||
norm = lambda x: nn.GroupNorm(8, x)
|
||||
else:
|
||||
norm = lambda x: nn.BatchNorm1d(x)
|
||||
self.gating_weights = nn.Parameter(torch.randn(dim, dim) * 1 / math.sqrt(dim))
|
||||
self.sigmoid = nn.Sigmoid()
|
||||
|
||||
if add_batch_norm:
|
||||
self.gating_biases = None
|
||||
self.bn1 = norm(dim)
|
||||
else:
|
||||
self.gating_biases = nn.Parameter(torch.randn(dim) * 1 / math.sqrt(dim))
|
||||
self.bn1 = None
|
||||
|
||||
def forward(self, x):
|
||||
gates = torch.matmul(x, self.gating_weights)
|
||||
|
||||
if self.add_batch_norm:
|
||||
gates = self.bn1(gates)
|
||||
else:
|
||||
gates = gates + self.gating_biases
|
||||
|
||||
gates = self.sigmoid(gates)
|
||||
|
||||
activation = x * gates
|
||||
|
||||
return activation
|
||||
105
preparedataset.py
Normal file
105
preparedataset.py
Normal file
@@ -0,0 +1,105 @@
|
||||
import os
|
||||
|
||||
import numpy as np
|
||||
import torch
|
||||
import tqdm
|
||||
|
||||
|
||||
def k3602k(k, k360):
|
||||
for src in [0, 3, 4, 5, 6, 7, 9, 10]:
|
||||
tgt = src + 50
|
||||
path_pose = k360 + '/data_poses/2013_05_28_drive_%04d_sync/cam0_to_world.txt' % src
|
||||
path_velo = k360 + '/data_3d_raw/2013_05_28_drive_%04d_sync/velodyne_points/data' % src
|
||||
path_calib = k360 + '/calibration/calib_cam_to_velo.txt'
|
||||
path_img = k360 + '/data_2d_raw/2013_05_28_drive_%04d_sync/image_00/data_rect' % src
|
||||
|
||||
path_pose1 = k + '/data_odometry_poses/dataset/poses/%02d.npy' % tgt
|
||||
path_velo1 = k + '/data_odometry_velodyne/dataset/sequences/%02d/velodyne' % tgt
|
||||
path_calib1 = k + '/data_odometry_calib/dataset/sequences/%02d' % tgt
|
||||
path_img1 = k + '/data_odometry_color/dataset/sequences/%02d/image_2/' % tgt
|
||||
|
||||
if not os.path.exists(path_velo1):
|
||||
os.makedirs(path_velo1)
|
||||
if not os.path.exists(path_img1):
|
||||
os.makedirs(path_img1)
|
||||
if not os.path.exists(path_calib1):
|
||||
os.makedirs(path_calib1)
|
||||
|
||||
if not os.path.exists(path_calib1 + '/calib.txt'):
|
||||
os.symlink(path_calib, path_calib1 + '/calib.txt')
|
||||
|
||||
with open(path_calib, 'r') as f:
|
||||
for line in f.readlines():
|
||||
data = np.array([float(x) for x in line.split()])
|
||||
cam0_to_velo = np.reshape(data, (3, 4))
|
||||
cam0_to_velo = np.vstack([cam0_to_velo, [0, 0, 0, 1]])
|
||||
cam0_to_velo = torch.tensor(cam0_to_velo)
|
||||
poses2 = []
|
||||
ids = []
|
||||
with open(path_pose, 'r') as f:
|
||||
for x in f:
|
||||
x = x.strip().split()
|
||||
x = [float(v) for v in x]
|
||||
ids.append(int(x[0]))
|
||||
pose = torch.zeros((4, 4), dtype=torch.float64)
|
||||
pose[0, 0:4] = torch.tensor(x[1:5])
|
||||
pose[1, 0:4] = torch.tensor(x[5:9])
|
||||
pose[2, 0:4] = torch.tensor(x[9:13])
|
||||
pose[3, 3] = 1.0
|
||||
pose = pose @ cam0_to_velo.inverse()
|
||||
poses2.append(pose.float().numpy())
|
||||
pose = np.stack(poses2)
|
||||
np.save(path_pose1, pose)
|
||||
|
||||
cnt = 0
|
||||
for i in tqdm.tqdm(ids, desc='%02d:' % src):
|
||||
path_velo_now = os.path.join(path_velo, '0000%06d.bin' % i)
|
||||
path_img_now = os.path.join(path_img, '0000%06d.png' % i)
|
||||
if os.path.exists(path_velo_now) and os.path.exists(path_img_now):
|
||||
pass
|
||||
else:
|
||||
break
|
||||
path_velo_now1 = os.path.join(path_velo1, '%06d.bin' % cnt)
|
||||
path_img_now1 = os.path.join(path_img1, '%06d.png' % cnt)
|
||||
if not os.path.exists(path_velo_now1):
|
||||
os.symlink(path_velo_now, path_velo_now1)
|
||||
if not os.path.exists(path_img_now1):
|
||||
os.symlink(path_img_now, path_img_now1)
|
||||
cnt = cnt + 1
|
||||
|
||||
|
||||
def todataset(kitti_root, dataset_root):
|
||||
sequences = [0, 5, 6, 7, 8, 9, 50, 54, 55, 56, 59]
|
||||
for s in sequences:
|
||||
if s >= 50:
|
||||
suffix = '.npy'
|
||||
else:
|
||||
suffix = '.txt'
|
||||
kitti_velo_dir = kitti_root + '/data_odometry_velodyne/dataset/sequences/%02d/velodyne' % s
|
||||
kitti_img_dir = kitti_root + '/data_odometry_color/dataset/sequences/%02d/image_2' % s
|
||||
kitti_pose_dir = kitti_root + '/data_odometry_poses/dataset/poses/%02d' % s + suffix
|
||||
kitti_calib_dir = kitti_root + '/data_odometry_calib/dataset/sequences/%02d/calib.txt' % s
|
||||
dataset_path = dataset_root + '/%02d' % s
|
||||
if not os.path.exists(dataset_path):
|
||||
os.makedirs(dataset_path)
|
||||
dataset_velo_dir = dataset_path + '/velodyne'
|
||||
dataset_img_dir = dataset_path + '/image_2'
|
||||
dataset_pose_dir = dataset_path + '/poses' + suffix
|
||||
dataset_calib_dir = dataset_path + '/calib.txt'
|
||||
if not os.path.exists(dataset_velo_dir):
|
||||
os.symlink(kitti_velo_dir, dataset_velo_dir)
|
||||
if not os.path.exists(dataset_img_dir):
|
||||
os.symlink(kitti_img_dir, dataset_img_dir)
|
||||
if not os.path.exists(dataset_pose_dir):
|
||||
os.symlink(kitti_pose_dir, dataset_pose_dir)
|
||||
if not os.path.exists(dataset_calib_dir):
|
||||
os.symlink(kitti_calib_dir, dataset_calib_dir)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
k360 = '/home/coop8/chenyouyuan/FUSIONLCD/KITTI360'
|
||||
k = '/home/coop8/chenyouyuan/FUSIONLCD/KITTI'
|
||||
k3602k(k, k360)
|
||||
kitti_root = '/home/coop8/chenyouyuan/FUSIONLCD/KITTI'
|
||||
dataset_root = '/home/coop8/chenyouyuan/FUSIONLCD/sequences'
|
||||
todataset(kitti_root, dataset_root)
|
||||
149
tools.py
Normal file
149
tools.py
Normal file
@@ -0,0 +1,149 @@
|
||||
import threading
|
||||
import torch
|
||||
import os
|
||||
import time
|
||||
|
||||
|
||||
def farthest_point_sample(xyz, npoint):
|
||||
"""Iterative farthest point sampling
|
||||
|
||||
Args:
|
||||
xyz: pointcloud data_loader, [B, N, C]
|
||||
npoint: number of samples
|
||||
Returns:
|
||||
centroids: sampled pointcloud index, [B, npoint]
|
||||
"""
|
||||
device = xyz.device
|
||||
B, N, C = xyz.shape
|
||||
centroids = torch.zeros(B, npoint, dtype=torch.long).to(device)
|
||||
distance = torch.ones(B, N).to(device) * 1e10
|
||||
farthest = torch.randint(0, N, (B,), dtype=torch.long).to(device)
|
||||
batch_indices = torch.arange(B, dtype=torch.long).to(device)
|
||||
for i in range(npoint):
|
||||
centroids[:, i] = farthest
|
||||
centroid = xyz[batch_indices, farthest, :].view(B, 1, 3)
|
||||
dist = torch.sum((xyz - centroid) ** 2, -1)
|
||||
mask = dist < distance
|
||||
distance[mask] = dist[mask]
|
||||
farthest = torch.max(distance, -1)[1]
|
||||
return centroids
|
||||
|
||||
|
||||
def batch_distance(feature1,feature2,mode='cosine'):
|
||||
if mode == 'cosine':
|
||||
# Transport cost matrix
|
||||
feature1 = feature1 / torch.sqrt(torch.sum(feature1 ** 2, -1, keepdim=True) + 1e-8)
|
||||
feature2 = feature2 / torch.sqrt(torch.sum(feature2 ** 2, -1, keepdim=True) + 1e-8)
|
||||
dis = 1.0 - torch.bmm(feature1, feature2.transpose(1, 2))
|
||||
elif mode == 'euclidean':
|
||||
feature=torch.cat([feature1,feature2],dim=1)
|
||||
feature_mean=torch.mean(feature,dim=1,keepdim=True)
|
||||
feature1=feature1-feature_mean
|
||||
feature2=feature2-feature_mean
|
||||
distance_matrix = torch.sum(feature1 ** 2, -1, keepdim=True)
|
||||
distance_matrix = distance_matrix + torch.sum(feature2 ** 2, -1, keepdim=True).transpose(1, 2)
|
||||
distance_matrix = distance_matrix - 2 * torch.bmm(feature1, feature2.transpose(1, 2)) # c^2=a^2+b^2-2abcos
|
||||
distance_matrix = distance_matrix ** 0.5
|
||||
dis = distance_matrix
|
||||
return dis
|
||||
|
||||
def nn_match(fea1, fea2, matrix='cosine'):
|
||||
assert len(fea1.shape) == 2 and len(fea2.shape) == 2, 'nnmatch error'
|
||||
if not isinstance(fea1, torch.Tensor):
|
||||
fea1 = torch.tensor(fea1)
|
||||
if not isinstance(fea2, torch.Tensor):
|
||||
fea2 = torch.tensor(fea2)
|
||||
if matrix == 'cosine':
|
||||
# Transport cost matrix
|
||||
fea1 = fea1 / torch.sqrt(torch.sum(fea1 ** 2, -1, keepdim=True) + 1e-8)
|
||||
fea2 = fea2 / torch.sqrt(torch.sum(fea2 ** 2, -1, keepdim=True) + 1e-8)
|
||||
dis = 1.0 - torch.mm(fea1, fea2.transpose(0, 1))
|
||||
elif matrix == 'euclidean':
|
||||
distance_matrix = torch.sum(fea1 ** 2, -1, keepdim=True)
|
||||
distance_matrix = distance_matrix + torch.sum(fea2 ** 2, -1, keepdim=True).transpose(0, 1)
|
||||
distance_matrix = distance_matrix - 2 * torch.mm(fea1, fea2.transpose(0, 1)) # c^2=a^2+b^2-2abcos
|
||||
dis = distance_matrix ** 0.5
|
||||
else:
|
||||
dis = 0
|
||||
print('Invalid matrix')
|
||||
idx0_min = torch.argmin(dis, dim=0)
|
||||
idx1_min = torch.argmin(dis, dim=1)
|
||||
ids1 = torch.arange(0, dis.shape[1]).to(fea1.device)
|
||||
idx = idx1_min[idx0_min]
|
||||
idx_match = ids1 == idx
|
||||
idx1 = ids1[idx_match]
|
||||
idx2 = idx0_min[idx_match]
|
||||
dis_min = dis[idx2, idx1]
|
||||
|
||||
return idx2, idx1, dis_min
|
||||
|
||||
|
||||
def path_join(*args):
|
||||
names = list(args)
|
||||
path = names[0]
|
||||
for i in range(len(names) - 1):
|
||||
path = os.path.join(path, names[i + 1])
|
||||
path = list(path)
|
||||
while "\\" in path:
|
||||
idx = path.index("\\")
|
||||
path[idx] = "/"
|
||||
path = ''.join(path)
|
||||
return path
|
||||
|
||||
|
||||
def make_save_path(*args):
|
||||
path = path_join(*args)
|
||||
if not os.path.exists(path):
|
||||
os.makedirs(path)
|
||||
return path
|
||||
|
||||
|
||||
|
||||
def read_cfg(data):
|
||||
if type(data) is int:
|
||||
result = [data]
|
||||
else:
|
||||
result = data.split(',')
|
||||
return result
|
||||
|
||||
|
||||
class Timer:
|
||||
"""A module to record the program running time"""
|
||||
|
||||
def __init__(self, name="Now"):
|
||||
self.strat = time.time()
|
||||
self.cnt = 0
|
||||
self.end = time.time()
|
||||
self.avg = 0
|
||||
self.all = 0
|
||||
self.now = 0
|
||||
self.name = name
|
||||
time_now = time.strftime('%Y%m%d_%H%M%S', time.localtime(time.time()))
|
||||
print('Init timer: ',time_now)
|
||||
|
||||
def update(self, name=None):
|
||||
if name is not None:
|
||||
self.name = name
|
||||
self.cnt = self.cnt + 1
|
||||
self.end = time.time()
|
||||
self.avg = (self.end - self.strat) / self.cnt
|
||||
self.now = self.end - self.all - self.strat
|
||||
self.all = self.end - self.strat
|
||||
time_now = time.strftime('%Y%m%d_%H%M%S', time.localtime(time.time()))
|
||||
if self.avg<1:
|
||||
print("%s | %s | using %d | each %.3f" %
|
||||
(time_now, self.name, self.all, self.now))
|
||||
elif self.avg<10:
|
||||
print("%s | %s | using %d | each %.2f" %
|
||||
(time_now, self.name, self.all, self.now))
|
||||
elif self.avg<100:
|
||||
print("%s | %s | using %d | each %.1f" %
|
||||
(time_now, self.name, self.all, self.now))
|
||||
else:
|
||||
print("%s | %s | using %d | each %d" %
|
||||
(time_now, self.name, self.all, self.now))
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
# draw_trace()
|
||||
pass
|
||||
432
train.py
Normal file
432
train.py
Normal file
@@ -0,0 +1,432 @@
|
||||
import argparse
|
||||
import os
|
||||
import time
|
||||
|
||||
import numpy as np
|
||||
import torch
|
||||
import torch.optim as optim
|
||||
import yaml
|
||||
|
||||
import net
|
||||
import tools
|
||||
from dataset import KittiTotalLoader
|
||||
from evaluate_lcd import lcd
|
||||
from loss import TotalLoss
|
||||
|
||||
test_step = 10 # 保存测试点的步长
|
||||
|
||||
def save_checkpoint(model, optimizer, loss_total_fun, epoch, iter_train, path_result):
|
||||
if (epoch + 1) % test_step == 0 and epoch+1>=test_step:
|
||||
|
||||
time_now = time.strftime('%Y%m%d_%H%M%S', time.localtime(time.time()))
|
||||
checkpoint = {'time': time_now,
|
||||
'epoch': epoch,
|
||||
'model': model.state_dict(),
|
||||
'optimizer': optimizer.state_dict()}
|
||||
savepath = tools.make_save_path(path_result, 'models')
|
||||
torch.save(checkpoint, savepath + '/checkpoint_%03d.pth.tar' % epoch)
|
||||
print(savepath + '/checkpoint_%03d.pth.tar is saved' % epoch)
|
||||
|
||||
class log_result():
|
||||
def __init__(self,path_result):
|
||||
self.path=path_result
|
||||
if not os.path.exists(path_result):
|
||||
with open(path_result, 'w') as file:
|
||||
file.write('Time Sequence Epoch AP R100 F1 R@1 R@2 R@3 R@4 R@5')
|
||||
file.write(' R@6 R@7 R@8 R@9 R@10 R@15 R@20 R@25\n')
|
||||
for i in range(300):
|
||||
file.write('\n')
|
||||
def write(self,seq,epoch,row,x):
|
||||
with open(self.path, 'r') as file:
|
||||
lines = file.readlines()
|
||||
time_now = time.strftime('%Y%m%d%H%M%S', time.localtime(time.time()))
|
||||
new_content='%s %08d %06d'%(time_now,seq,epoch)
|
||||
for x1 in x:
|
||||
new_content=new_content + ' %.3f'%x1
|
||||
|
||||
lines[row] = new_content+'\n'
|
||||
with open(self.path, 'w') as file:
|
||||
file.writelines(lines)
|
||||
|
||||
def train(model, optimizer, loss_total_fun, data, device):
|
||||
model.train()
|
||||
sequences = data['sequence']
|
||||
id_query = data['id_query']
|
||||
id_positive = data['id_positive']
|
||||
batchsize = len(id_query)
|
||||
|
||||
bev_query = data['bev_query'].to(device)
|
||||
bev_positive = data['bev_positive'].to(device)
|
||||
|
||||
pose_query = data['pose_query'].to(device)
|
||||
pose_positive = data['pose_positive'].to(device)
|
||||
|
||||
pose_to_frame = data['pose_to_frame'].to(device)
|
||||
label_score = data['label_score'].to(device)
|
||||
|
||||
img_query = data['img_query'].to(device)
|
||||
img_positive = data['img_positive'].to(device)
|
||||
try:
|
||||
bev = torch.cat([bev_query, bev_positive], dim=0)
|
||||
bev = bev.permute(0, 3, 1, 2)
|
||||
except:
|
||||
bev = 0
|
||||
try:
|
||||
img = torch.cat([img_query, img_positive], dim=0)
|
||||
img = img.permute(0, 3, 1, 2)
|
||||
except:
|
||||
img = 0
|
||||
try:
|
||||
relation = data['relation'].to(device)
|
||||
except:
|
||||
relation = 0
|
||||
|
||||
batch_dict = {'bev': bev,
|
||||
'label_score': label_score,
|
||||
'img': img,
|
||||
'relation': relation,
|
||||
'id_query': id_query,
|
||||
'sequence': sequences,
|
||||
'id_positive': id_positive,
|
||||
'pose_to_frame': pose_to_frame,
|
||||
'pose_query': pose_query,
|
||||
'pose_positive': pose_positive,
|
||||
'batch_size': int(batchsize * 2)}
|
||||
|
||||
model(batch_dict)
|
||||
|
||||
loss_total_fun(batch_dict)
|
||||
l_total = batch_dict['loss'][0]
|
||||
|
||||
optimizer.zero_grad()
|
||||
l_total.backward()
|
||||
optimizer.step()
|
||||
for p in model.parameters():
|
||||
if torch.isnan(p).any():
|
||||
print('Model NAN, ', p.shape)
|
||||
exit()
|
||||
return batch_dict
|
||||
|
||||
|
||||
def validate(model, loss_total_fun, data, device):
|
||||
model.eval()
|
||||
with torch.no_grad():
|
||||
sequences = data['sequence']
|
||||
id_query = data['id_query']
|
||||
id_positive = data['id_positive']
|
||||
batchsize = len(id_query)
|
||||
|
||||
bev_query = data['bev_query'].to(device)
|
||||
bev_positive = data['bev_positive'].to(device)
|
||||
|
||||
pose_query = data['pose_query'].to(device)
|
||||
pose_positive = data['pose_positive'].to(device)
|
||||
|
||||
pose_to_frame = data['pose_to_frame'].to(device)
|
||||
label_score = data['label_score'].to(device)
|
||||
|
||||
img_query = data['img_query'].to(device)
|
||||
img_positive = data['img_positive'].to(device)
|
||||
try:
|
||||
bev = torch.cat([bev_query, bev_positive], dim=0)
|
||||
bev = bev.permute(0, 3, 1, 2)
|
||||
except:
|
||||
bev = 0
|
||||
try:
|
||||
img = torch.cat([img_query, img_positive], dim=0)
|
||||
img = img.permute(0, 3, 1, 2)
|
||||
except:
|
||||
img = 0
|
||||
try:
|
||||
relation = data['relation'].to(device)
|
||||
except:
|
||||
relation = 0
|
||||
|
||||
batch_dict = {'bev': bev,
|
||||
'label_score': label_score,
|
||||
'img': img,
|
||||
'relation': relation,
|
||||
'id_query': id_query,
|
||||
'sequence': sequences,
|
||||
'id_positive': id_positive,
|
||||
'pose_to_frame': pose_to_frame,
|
||||
'pose_query': pose_query,
|
||||
'pose_positive': pose_positive,
|
||||
'batch_size': int(batchsize * 2)}
|
||||
model(batch_dict)
|
||||
loss_total_fun(batch_dict)
|
||||
|
||||
return batch_dict
|
||||
|
||||
|
||||
def test(model, data, device):
|
||||
model.eval()
|
||||
with torch.no_grad():
|
||||
sequences = data['sequence']
|
||||
id_query = data['id_query']
|
||||
batchsize = len(id_query)
|
||||
bev_query = data['bev_query'].to(device)
|
||||
pose_query = data['pose_query'].to(device)
|
||||
|
||||
img_query = data['img_query'].to(device)
|
||||
try:
|
||||
bev = bev_query
|
||||
bev = bev.permute(0, 3, 1, 2)
|
||||
except:
|
||||
bev = 0
|
||||
try:
|
||||
img = img_query
|
||||
img = img.permute(0, 3, 1, 2)
|
||||
except:
|
||||
img = 0
|
||||
try:
|
||||
relation = data['relation'].to(device)
|
||||
except:
|
||||
relation = 0
|
||||
|
||||
batch_dict = {'bev': bev,
|
||||
'img': img,
|
||||
'relation': relation,
|
||||
'id_query': id_query,
|
||||
'sequence': sequences,
|
||||
'pose_query': pose_query,
|
||||
'batch_size': int(batchsize * 2)}
|
||||
|
||||
model(batch_dict)
|
||||
|
||||
return batch_dict
|
||||
|
||||
|
||||
def main(args):
|
||||
try:
|
||||
with open(os.path.join(os.getcwd(), "config.yaml"), "r") as ymlfile:
|
||||
cfg = yaml.load(ymlfile, Loader=yaml.SafeLoader)
|
||||
print('Loading config file from %s' % os.path.join(os.getcwd(), "config.yaml"))
|
||||
except:
|
||||
with open(os.path.join(os.getcwd(), "project/FUSIONLCD/config.yaml"), "r") as ymlfile:
|
||||
cfg = yaml.load(ymlfile, Loader=yaml.SafeLoader)
|
||||
print('Loading config file from %s' % os.path.join(os.getcwd(), "project/BevNvLcd/config.yaml"))
|
||||
cfg = cfg['experiment']
|
||||
for k, v in cfg.items():
|
||||
print(k, ':', v)
|
||||
path_result = os.path.join(cfg['path_result'],args.result_name)
|
||||
lres=log_result(os.path.join(os.getcwd(),'result',args.result_name+'.txt'))
|
||||
device = torch.device("cuda" if torch.cuda.is_available() and cfg['cuda'] else "cpu")
|
||||
start_epoch = 0
|
||||
iter_train = 0
|
||||
epochs = cfg['epochs']
|
||||
model = net.Fusion(cfg)
|
||||
print(model)
|
||||
model = model.to(device)
|
||||
loss_total_fun = TotalLoss(cfg).to(device)
|
||||
print("Model params: %.6fM" % (sum(p.numel() for p in model.parameters()) / 1e6))
|
||||
optimizer = optim.Adam(model.parameters(), lr=cfg['learning_rate'], betas=(cfg['beta1'], cfg['beta2']), eps=cfg['eps'], weight_decay=cfg['weight_decay'])
|
||||
# optimizer = optim.Adam([{'params': model.bev.parameters(), 'lr': 0.0002},
|
||||
# {'params': model.img.parameters(), 'lr': 0.0001},
|
||||
# {'params': model.vlad_fusion_layer.parameters(), 'lr': 0.0001}],
|
||||
# betas=(cfg['beta1'], cfg['beta2']), eps=cfg['eps'], weight_decay=cfg['weight_decay'])
|
||||
# print(optimizer)
|
||||
loader_train, loader_val, loader_test = KittiTotalLoader(cfg)
|
||||
# scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[1, 3, 5, 10, 50, 100], gamma=0.5, last_epoch=start_epoch - 1)
|
||||
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 1, 0.99)
|
||||
# scheduler = warmup(optimizer, 5, 1e-6, cfg['learning_rate'])
|
||||
# writer = SummaryWriter(tools.make_save_path(path_result, 'tensorboard_log'))
|
||||
t = tools.Timer()
|
||||
test_best = np.zeros([len(loader_test.dataset.datasets), 3])
|
||||
if cfg['load_model']:
|
||||
checkpoint = torch.load((cfg['last_model']))
|
||||
start_epoch = checkpoint['epoch'] + 1 * cfg['train_flag']
|
||||
state_dict_saved = checkpoint['model']
|
||||
model.load_state_dict(state_dict_saved)
|
||||
optimizer.load_state_dict(checkpoint['optimizer'])
|
||||
print('loaded %s' % cfg['last_model'])
|
||||
if not cfg['train_flag']:
|
||||
print_frequency = 1e9
|
||||
else:
|
||||
print_frequency = 1
|
||||
for epoch in range(start_epoch, epochs):
|
||||
torch.cuda.empty_cache()
|
||||
'''
|
||||
============================== train ===============================
|
||||
'''
|
||||
if cfg['train_flag']:
|
||||
if epoch - start_epoch == 0:
|
||||
pf = print_frequency
|
||||
print_frequency = min(len(loader_train), print_frequency * 10)
|
||||
else:
|
||||
print_frequency = pf
|
||||
l0, l1, l2, l3, l4, l5, l6, l7, l8, l9, l10,l11 = 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
|
||||
step_print = max(1, int(len(loader_train) / print_frequency))
|
||||
step_now = 0
|
||||
optimizer.zero_grad()
|
||||
for id_sample, data in enumerate(loader_train):
|
||||
batch_dict = train(model, optimizer, loss_total_fun, data, device)
|
||||
|
||||
# if (id_sample+1)%4==0 or (id_sample+1)==len(loader_train):
|
||||
# optimizer.step()
|
||||
# optimizer.zero_grad()
|
||||
if step_now < step_print:
|
||||
step_now = step_now + 1
|
||||
l0 = l0 + batch_dict['loss'][0]
|
||||
l1 = l1 + batch_dict['loss'][1]
|
||||
l2 = l2 + batch_dict['loss'][2]
|
||||
l3 = l3 + batch_dict['loss'][3]
|
||||
l4 = l4 + batch_dict['loss'][4]
|
||||
l5 = l5 + batch_dict['loss'][5]
|
||||
l6 = l6 + batch_dict['loss'][6]
|
||||
l7 = l7 + batch_dict['loss'][7]
|
||||
l8 = l8 + batch_dict['loss'][8]
|
||||
l9 = l9 + batch_dict['loss'][9]
|
||||
l10 = l10 + batch_dict['loss'][10]
|
||||
l11 = l11 + batch_dict['loss'][11]
|
||||
if step_now == step_print:
|
||||
step_now = 0
|
||||
info = 'loss a%.3f p%.3f s%.3f m%.3f t%.3f tr%.3f_%.1f genb%.3f geni%.3f genpa%.3f genpo%.3f genkpl%.3f' % (
|
||||
l0 / step_print, l1 / step_print, l2 / step_print, l3 / step_print,
|
||||
l4 / step_print, l5 / step_print, l6 / step_print, l7 / step_print,
|
||||
l8 / step_print, l9 / step_print, l10 / step_print, l11 / step_print)
|
||||
t.update("Epoch %03d | train %04d/%04d | %s" %
|
||||
(epoch, id_sample, len(loader_train) - 1, info))
|
||||
l0, l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11 = 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
|
||||
save_checkpoint(model, optimizer, loss_total_fun, epoch, iter_train, path_result)
|
||||
scheduler.step()
|
||||
'''
|
||||
============================= validate =============================
|
||||
'''
|
||||
if cfg['validate_flag'] and (epoch + 1) % test_step == 0:
|
||||
l0, l1, l2, l3, l4, l5, l6, l7, l8, l9, l10,l11 = 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
|
||||
step_print = max(1, int(len(loader_val) / print_frequency))
|
||||
step_now = 0
|
||||
for id_sample, data in enumerate(loader_val):
|
||||
batch_dict = validate(model, loss_total_fun, data, device)
|
||||
if step_now < step_print:
|
||||
step_now = step_now + 1
|
||||
l0 = l0 + batch_dict['loss'][0]
|
||||
l1 = l1 + batch_dict['loss'][1]
|
||||
l2 = l2 + batch_dict['loss'][2]
|
||||
l3 = l3 + batch_dict['loss'][3]
|
||||
l4 = l4 + batch_dict['loss'][4]
|
||||
l5 = l5 + batch_dict['loss'][5]
|
||||
l6 = l6 + batch_dict['loss'][6]
|
||||
l7 = l7 + batch_dict['loss'][7]
|
||||
l8 = l8 + batch_dict['loss'][8]
|
||||
l9 = l9 + batch_dict['loss'][9]
|
||||
l10 = l10 + batch_dict['loss'][10]
|
||||
l11 = l11 + batch_dict['loss'][11]
|
||||
if step_now == step_print:
|
||||
step_now = 0
|
||||
info = 'loss a%.3f p%.3f s%.3f m%.3f t%.3f tr%.3f_%.1f genb%.3f geni%.3f genpa%.3f genpo%.3f genkpl%.3f' % (
|
||||
l0 / step_print, l1 / step_print, l2 / step_print, l3 / step_print,
|
||||
l4 / step_print, l5 / step_print, l6 / step_print, l7 / step_print,
|
||||
l8 / step_print, l9 / step_print, l10 / step_print, l11 / step_print)
|
||||
t.update("Epoch %03d | validate %04d/%04d | %s" %
|
||||
(epoch, id_sample, len(loader_val) - 1, info))
|
||||
l0, l1, l2, l3, l4, l5, l6, l7, l8, l9, l10,l11 = 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
|
||||
'''
|
||||
============================== test ================================
|
||||
'''
|
||||
if cfg['test_flag'] and (epoch + 1) % test_step == 0:
|
||||
step_print = max(1, int(len(loader_test) / print_frequency))
|
||||
step_now = 0
|
||||
vlads = []
|
||||
kpts = []
|
||||
feas_original = []
|
||||
feas_fusion = []
|
||||
sequences = []
|
||||
poses = []
|
||||
for id_sample, data in enumerate(loader_test):
|
||||
batch_dict = test(model, data, device)
|
||||
# save_figure(batch_dict, epoch, path_result, cfg)
|
||||
sequences.append((batch_dict['sequence']).detach().cpu())
|
||||
vlads.append(batch_dict['vlads'].detach().cpu())
|
||||
poses.append(batch_dict['pose_query'].detach().cpu())
|
||||
kpts.append(batch_dict['key_points'].detach().cpu())
|
||||
if 'fea_kpt_fusion' in batch_dict.keys():
|
||||
feas_fusion.append(batch_dict['fea_kpt_fusion'].detach().cpu().permute(0, 2, 1))
|
||||
feas_original.append(batch_dict['fea_kpt_original'].detach().cpu().permute(0, 2, 1))
|
||||
|
||||
if step_now < step_print:
|
||||
step_now = step_now + 1
|
||||
if step_now == step_print:
|
||||
step_now = 0
|
||||
t.update("Epoch %03d | test %05d/%05d" % (epoch, id_sample, len(loader_test)))
|
||||
vlads = torch.cat(vlads)
|
||||
kpts = torch.cat(kpts)
|
||||
feas_original = torch.cat(feas_original)
|
||||
if 'fea_kpt_fusion' in batch_dict.keys():
|
||||
feas_fusion = torch.cat(feas_fusion)
|
||||
else:
|
||||
feas_fusion=feas_original
|
||||
|
||||
poses = torch.cat(poses)
|
||||
sequences = torch.cat(sequences)
|
||||
|
||||
database = {'vlads': vlads,
|
||||
'key_points': kpts,
|
||||
'fea_kpt_original': feas_original,
|
||||
'fea_kpt_fusion': feas_fusion,
|
||||
'fea_kpt': feas_fusion,
|
||||
'sequences': sequences,
|
||||
'pose_query': poses}
|
||||
savepath = tools.make_save_path(path_result, 'database')
|
||||
|
||||
torch.save(database, savepath + '/database_bevp.pth.tar')
|
||||
# print('save ' + savepath + '/database_%03d.pth.tar' % epoch)
|
||||
# exit()
|
||||
# database = torch.load('/data4/caodanyang/results/FUSIONLCD/07250/database/database_159.pth.tar')
|
||||
print()
|
||||
print('***************************************************************************************************************************************')
|
||||
print('Epoch %03d' % epoch)
|
||||
# feature_match(loader_val,database)
|
||||
result,recall_at_k = lcd(database)
|
||||
seq = torch.unique(sequences)
|
||||
for i in range((test_best.shape[0])):
|
||||
recall_at_k1=recall_at_k[i]
|
||||
for j in range((test_best.shape[1])):
|
||||
test_best[i, j] = max([test_best[i, j], result[i][j]])
|
||||
print('Best, sequence %02d, AP=%.3f, R100=%.3f, F1=%.3f' % (seq[i], test_best[i, 0], test_best[i, 1],test_best[i, 2]))
|
||||
lres.write(seq[i],epoch,(epoch+1)//test_step+i*(epochs)//test_step,
|
||||
[result[i][0],result[i][1],result[i][2],recall_at_k1[0],recall_at_k1[1],recall_at_k1[2],recall_at_k1[3],recall_at_k1[4],
|
||||
recall_at_k1[5],recall_at_k1[6],recall_at_k1[7],recall_at_k1[8],recall_at_k1[9],recall_at_k1[14],recall_at_k1[19],recall_at_k1[24]
|
||||
])
|
||||
# print('Sequence %02d, AP=%.3f[%.3f], R100=%.3f[%.3f], F1=%.3f[%.3f], Recall@1[%.3f] 2[%.3f] 5[%.3f] 10[%.3f] 15[%.3f] 25[%.3f]' %
|
||||
# (sequences[i], result[i][0], test_best[i, 0], result[i][1], test_best[i, 1], result[i][2], test_best[i, 2],
|
||||
# recall_at_k1[0],recall_at_k1[1],recall_at_k1[4],recall_at_k1[9],recall_at_k1[14],recall_at_k1[24]))
|
||||
print('***************************************************************************************************************************************')
|
||||
print()
|
||||
if cfg['train_flag']:
|
||||
pass
|
||||
else:
|
||||
exit()
|
||||
# exit()
|
||||
|
||||
if __name__ == '__main__':
|
||||
# CUDA_VISIBLE_DEVICES=2 nohup python -u train.py --result_name=08280 --info=cosim >log/08280.log 2>&1 &
|
||||
# fuser /dev/nvidia*
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument('--result_name', type=str, default='log', help='log name of result')
|
||||
parser.add_argument('--pro_name', type=str, default='python', help='name of process')
|
||||
parser.add_argument('--info', type=str, default='python', help='name of process')
|
||||
parser.add_argument('--gpu', type=str, default=None, help="GPU id(s), e.g. '0' or '0,1'. Use 'cpu' to force CPU.")
|
||||
args = parser.parse_args()
|
||||
|
||||
# set visible GPUs before any CUDA call / seed
|
||||
if args.gpu:
|
||||
if args.gpu.lower() == 'cpu':
|
||||
# force CPU by hiding GPUs
|
||||
os.environ['CUDA_VISIBLE_DEVICES'] = ''
|
||||
else:
|
||||
os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
|
||||
|
||||
np.random.seed(123)
|
||||
torch.manual_seed(123)
|
||||
# only call cuda seed if CUDA visible
|
||||
if torch.cuda.is_available():
|
||||
torch.cuda.manual_seed(123)
|
||||
|
||||
print(args.info)
|
||||
try:
|
||||
print("Using GPU device:", os.environ.get("CUDA_VISIBLE_DEVICES", ""))
|
||||
except:
|
||||
pass
|
||||
main(args)
|
||||
150
uot.py
Normal file
150
uot.py
Normal file
@@ -0,0 +1,150 @@
|
||||
import math
|
||||
import torch
|
||||
import torch._utils
|
||||
import torch.nn as nn
|
||||
import torch.nn.functional as F
|
||||
from torchvision.models import resnet
|
||||
from typing import Optional, Callable
|
||||
|
||||
|
||||
def compute_rigid_transform(points1, points2, weights):
|
||||
"""Compute rigid transforms between two point clouds via weighted SVD.
|
||||
Adapted from https://github.com/yewzijian/RPMNet/
|
||||
Args:
|
||||
points1 (torch.Tensor): (B, M, 3) coordinates of the first point cloud
|
||||
points2 (torch.Tensor): (B, N, 3) coordinates of the second point cloud
|
||||
weights (torch.Tensor): (B, M)
|
||||
Returns:
|
||||
Transform T (B, 3, 4) to get from points1 to points2, i.e. T*points1 = points2
|
||||
"""
|
||||
|
||||
weights_normalized = weights[..., None] / (torch.sum(weights[..., None], dim=1, keepdim=True) + 1e-5)
|
||||
centroid_a = torch.sum(points1 * weights_normalized, dim=1)
|
||||
centroid_b = torch.sum(points2 * weights_normalized, dim=1)
|
||||
a_centered = points1 - centroid_a[:, None, :]
|
||||
b_centered = points2 - centroid_b[:, None, :]
|
||||
cov = a_centered.transpose(-2, -1) @ (b_centered * weights_normalized)
|
||||
|
||||
# Compute rotation using Kabsch algorithm. Will compute two copies with +/-V[:,:3]
|
||||
# and choose based on determinant to avoid flips
|
||||
u, s, v = torch.svd(cov, some=False, compute_uv=True)
|
||||
rot_mat_pos = v @ u.transpose(-1, -2)
|
||||
v_neg = v.clone()
|
||||
v_neg[:, :, 2] *= -1
|
||||
rot_mat_neg = v_neg @ u.transpose(-1, -2)
|
||||
rot_mat = torch.where(torch.det(rot_mat_pos)[:, None, None] > 0, rot_mat_pos, rot_mat_neg)
|
||||
assert torch.all(torch.det(rot_mat) > 0)
|
||||
|
||||
# Compute translation (uncenter centroid)
|
||||
translation = -rot_mat @ centroid_a[:, :, None] + centroid_b[:, :, None]
|
||||
|
||||
transform = torch.cat((rot_mat, translation), dim=2)
|
||||
return transform
|
||||
|
||||
|
||||
def sinkhorn_unbalanced(feature1, feature2, epsilon, gamma, max_iter, matrix='cosine'):
|
||||
"""
|
||||
Sinkhorn algorithm for Unbalanced Optimal Transport.
|
||||
Modified from https://github.com/valeoai/FLOT/
|
||||
Args:
|
||||
feature1 (torch.Tensor):
|
||||
(B, N, C) Point-wise features for points cloud 1.
|
||||
feature2 (torch.Tensor):
|
||||
(B, M, C) Point-wise features for points cloud 2.
|
||||
epsilon (torch.Tensor):
|
||||
Entropic regularization.
|
||||
gamma (torch.Tensor):
|
||||
Mass regularization.
|
||||
max_iter (int):
|
||||
Number of iteration of the Sinkhorn algorithm.
|
||||
Returns:
|
||||
T (torch.Tensor):
|
||||
(B, N, M) Transport plan between point cloud 1 and 2.
|
||||
"""
|
||||
if matrix == 'cosine':
|
||||
# Transport cost matrix
|
||||
feature1 = feature1 / torch.sqrt(torch.sum(feature1 ** 2, -1, keepdim=True) + 1e-8)
|
||||
feature2 = feature2 / torch.sqrt(torch.sum(feature2 ** 2, -1, keepdim=True) + 1e-8)
|
||||
C = 1.0 - torch.bmm(feature1, feature2.transpose(1, 2))
|
||||
elif matrix == 'euclidean':
|
||||
distance_matrix = torch.sum(feature1 ** 2, -1, keepdim=True)
|
||||
distance_matrix = distance_matrix + torch.sum(feature2 ** 2, -1, keepdim=True).transpose(1, 2)
|
||||
distance_matrix = distance_matrix - 2 * torch.bmm(feature1, feature2.transpose(1, 2)) # c^2=a^2+b^2-2abcos
|
||||
distance_matrix = distance_matrix ** 0.5
|
||||
# d_max, _ = torch.max(distance_matrix, dim=2, keepdim=True)
|
||||
C = distance_matrix
|
||||
|
||||
# Entropic regularisation
|
||||
K = torch.exp(-C / epsilon) # * support
|
||||
|
||||
# Early return if no iteration
|
||||
if max_iter == 0:
|
||||
return K
|
||||
|
||||
# Init. of Sinkhorn algorithm
|
||||
power = gamma / (gamma + epsilon + 1e-8)
|
||||
a = (torch.ones((K.shape[0], K.shape[1], 1), device=feature1.device, dtype=feature1.dtype) / K.shape[1])
|
||||
prob1 = (torch.ones((K.shape[0], K.shape[1], 1), device=feature1.device, dtype=feature1.dtype) / K.shape[1])
|
||||
prob2 = (torch.ones((K.shape[0], K.shape[2], 1), device=feature2.device, dtype=feature2.dtype) / K.shape[2])
|
||||
|
||||
# Sinkhorn algorithm
|
||||
for _ in range(max_iter):
|
||||
# Update b
|
||||
KTa = torch.bmm(K.transpose(1, 2), a)
|
||||
b = torch.pow(prob2 / (KTa + 1e-8), power)
|
||||
# Update a
|
||||
Kb = torch.bmm(K, b)
|
||||
a = torch.pow(prob1 / (Kb + 1e-8), power)
|
||||
|
||||
# Transportation map
|
||||
T = torch.mul(torch.mul(a, K), b.transpose(1, 2))
|
||||
return T
|
||||
|
||||
|
||||
class UOTHead(nn.Module):
|
||||
|
||||
def __init__(self, nb_iter=5,name='original'):
|
||||
super().__init__()
|
||||
self.epsilon = torch.nn.Parameter(torch.zeros(1)) # Entropic regularisation
|
||||
self.gamma = torch.nn.Parameter(torch.zeros(1)) # Mass regularisation
|
||||
self.nb_iter = nb_iter
|
||||
self.name=name
|
||||
|
||||
def forward(self, batch_dict, src_coords=None, mode='pairs'):
|
||||
|
||||
feats = batch_dict['fea_kpt_'+self.name].squeeze(-1)
|
||||
|
||||
B, C, NUM = feats.shape
|
||||
|
||||
assert B % 2 == 0, "Batch size must be multiple of 2: B anchor + B positive samples"
|
||||
B = B // 2
|
||||
feat1 = feats[:B]
|
||||
feat2 = feats[B:]
|
||||
|
||||
coords = batch_dict['key_points']
|
||||
coords1 = coords[:B, :, 0:3]
|
||||
coords2 = coords[B:, :, 0:3]
|
||||
|
||||
correspondences_feature = sinkhorn_unbalanced(
|
||||
feat1.permute(0, 2, 1),
|
||||
feat2.permute(0, 2, 1),
|
||||
epsilon=torch.exp(self.epsilon) + 0.03,
|
||||
gamma=torch.exp(self.gamma),
|
||||
max_iter=self.nb_iter,
|
||||
matrix='cosine',
|
||||
)
|
||||
|
||||
feature_corr_sum = correspondences_feature.sum(-1, keepdim=True)
|
||||
project_kpts = (correspondences_feature @ coords2) / (feature_corr_sum + 1e-8)
|
||||
project_feas = (correspondences_feature @ feat2.permute(0, 2, 1)) / (feature_corr_sum + 1e-8)
|
||||
|
||||
batch_dict['project_kpts_'+self.name] = project_kpts
|
||||
batch_dict['project_feas_'+self.name] = project_feas.permute(0, 2, 1)
|
||||
# batch_dict['project_coord_kpts'] = project_coord_kpts
|
||||
batch_dict['correspondences_feature_'+self.name] = correspondences_feature
|
||||
# batch_dict['correspondences_coord'] = correspondences_coord
|
||||
|
||||
transformation = compute_rigid_transform(coords1, project_kpts, feature_corr_sum.squeeze(-1))
|
||||
batch_dict['transformation_'+self.name] = transformation
|
||||
|
||||
return batch_dict
|
||||
Reference in New Issue
Block a user